Scalable Mesh Networks 
and 
The Address Space Balancing Problem 


Andrea Lo Pumo 
Girton College 


UNIVERSITY OF CAMBRIDGE


A dissertation submitted to the University of Cambridge 
in partial fulfilment of the requirements for the degree of 
Master of Philosophy in Advanced Computer Science 


University of Cambridge 
Computer Laboratory 
William Gates Building 
15 JJ Thomson Avenue 
Cambridge CB3 0FD 
UNITED KINGDOM 


Email: al565@cl.cam.ac.uk 
May 31, 2010 


Declaration 


I, Andrea Lo Pumo of Girton College, being a candidate for the M.Phil in Advanced
Computer Science, hereby declare that this report and the work described in it
are my own work, unaided except as may be specified below, and that the report
does not contain material that has already been used to any substantial extent for
a comparable purpose.


Total word count: 14980 


Signed: 


Date: 


This dissertation is copyright ©2010 Andrea Lo Pumo. 


All trademarks used in this dissertation are hereby acknowledged. 


Abstract 


Mesh network architectures are reliable and efficient. They maximize the network 
throughput with multiple paths and adopt alternative routes when a component 
fails. Moreover, network applications can optimize their performance by
exploiting up-to-date routing information.

Large scale versions of mesh networks are attractive both for ISPs, as a means to
lower the management cost of their infrastructure, and also for communities, as 
they can build and sustain city-wide wireless networks without requiring any third 
party support. 

Hierarchical routing protocols are natural candidates for implementing scalable 
mesh networks. However, when the network is dynamic, the hierarchical topology 
must be reconfigured after each event. In order to reduce the installation and 
management costs of a hierarchical mesh network, we propose distributed protocols
for automatically creating and maintaining the routing architecture. We also
derive a set of rules for solving the address space balancing problem, study
their behavior under different network conditions, and evaluate their performance
as the network becomes larger and more dynamic. We find that, in the worst case,
the number of address changes is upper-bounded by O(N), but in a network with
constant churn, the number of reconfigurations increases at least linearly as N
grows.
Chapter 1
Introduction 


Mesh network architectures are reliable and efficient: every node acts as an
independent router and, when a path breaks due to a link or node failure,
the network automatically adopts alternative routes. Moreover, nodes can increase
their throughput by exploiting the presence of multiple paths.


Thanks to the high availability of low-cost wireless devices, Wireless Mesh
Networks (WMN) are becoming the prevalent form of mesh networks. Their
applications are numerous and include broadband home LANs, security surveillance
systems and metropolitan area networks for transportation systems[1]. WMNs
are also effective for extending Internet access to remote rural areas[4] and for
combating the digital divide. Among the various application scenarios, community
WMNs spanning one or more city neighborhoods are the most interesting.
In fact, they support almost all kinds of common network services, such as web
servers, multiplayer games, file sharing systems and VoIP, and they can be viewed
as a localized small scale version of the Internet. Currently, two major community
WMNs are the Athens Wireless Metropolitan Network[2], which reached 2000
nodes in 2008, and the Berlin Freifunk[3].


A large scale version of a mesh network would not only be similar to the current
Internet, it would be much better. In fact, in a mesh network each node is a router
and every network application can access and exploit the information
regarding the routing infrastructure. Thus, applications would be able to make
more informed decisions that would improve latency and throughput. For example,
multiple idle paths could be used simultaneously to increase the bandwidth
between two nodes or, in the case of content distribution networks, the clients
themselves could decide which is the nearest replica accessible through
the least congested path. Routing information would be particularly useful for
distributed services like P2P applications: the virtual links of overlays could be
directly replaced with optimal physical paths.


However, with current routing protocols, mesh networks are still not ready to
grow to truly large sizes. In fact, they face serious scalability problems: as the
number of nodes grows, the demand imposed on routers increases rapidly, up to the
point where routing consumes all their resources. As an example, in 2009,
the mesh network of Internet autonomous systems comprised 350 thousand nodes
and, in order to execute the BGP protocol, a router needed 400 MB of memory
and a 1.1 GHz processor[32]. WMNs are even more sensitive: routers are generally
small devices with constrained resources, e.g. Access Points with 32 MB of
memory and a 200 MHz processor. Additionally, in WMNs the overhead caused by
routing packets can heavily decrease the network's throughput[5].


A classic approach for solving the problem of routing scalability is to structure the
network into a hierarchical topology. The aggregation induced by the hierarchy
yields small routing tables and reduces the routing update
overhead. In this dissertation, our main concern will be the design of distributed
protocols for automatically creating and maintaining a hierarchical routing
architecture. The automatic configuration of a hierarchical network presents several
benefits. First, the network installation and management costs are greatly
reduced. Second, the network can rapidly adapt to changes: nodes or links can
be easily added or removed and, in the case of a global attack or a system update,
the network can be quickly reconfigured from scratch. Furthermore, in the case
of WMNs, the ability to automatically configure the network becomes a necessity:
the network is dynamic, nodes may be added or removed frequently and the
hierarchical constraints may affect other nodes and force their reconfiguration.
Manual intervention would be costly and its response times would be too
slow to guarantee an appropriate continuity of service.


The dissertation is structured as follows: in Chapter 2, we introduce the
background and related works concerning wireless mesh networks. In Chapter 3, we
derive a set of rules for solving the address space balancing problem and, in
Chapter 4, we describe the protocols for implementing the rules in a
distributed way. Finally, in Chapter 5, we evaluate the performance of the
balancing rules and study their behavior as the network becomes larger and more
dynamic.


In the next section, we present a concise summary of the work undertaken in the
course of the MPhil project.


1.1 Methodology 


The fundamental research goal of the project was to verify to what extent
hierarchical architectures are applicable to large dynamic networks, such as city-wide
WMNs. To this end, we focused on designing distributed rules for automatically
reconfiguring the hierarchy after network changes and on evaluating their cost.


Initially, we decided to use some very simple and intuitive rules to obtain a
first understanding of the dynamics involved in the topology maintenance and
to discover their main issues. We wrote a high level simulator to aid our
study of the rules on small topologies. As a result, we collected different examples
showing impossibility results and situations that were not correctly covered by a
naive approach.


The study of the simple rules helped us to define our problem exactly and to
reformulate it in a more general form. After reviewing the literature, we understood
that it was strictly related to classic NP-complete problems of graph partitioning
and clustering. However, since we needed distributed and non-reset based
protocols, we could not apply any known solution.


Before proceeding further in the design of the rules, we theoretically investigated
the problem space, trying to understand the fundamental limitations
that constrained our design choices. Next, to analyze the complex dynamics
generated by the rules, we independently studied the gnode split and the gnode
saturation problems. Our main intuition for analyzing the gnode split problem was
that a cluster could be approximated as a random graph¹ and that the expected
number of reconfigurations could be derived from the size of its giant connected
component. For the saturation problem, we analyzed the performance of the rules
on simple topologies first and then generalized the results to arbitrary graphs.


For estimating the overall cost of the rules we derived a natural upper bound on
the number of reconfigurations due to dynamic network events. Obtaining a lower
bound was harder: the rules were too dependent on the underlying graphs and
it was not possible to obtain a meaningful estimate by theory alone. Thus, we
wrote a second simulator and designed an experiment for evaluating the
rules under churn conditions. This time, writing the software was simpler: we
proved that in order to get a lower bound it was sufficient to simulate the rules
only on the first level of the hierarchy. This simplification allowed us not only to
get the lower bound, but also to study the behavior of the rules under different
network conditions.


Finally, writing the dissertation was a task in itself: we reassembled in a coherent
form all the notes written during the project.


¹ This intuition was later confirmed by simulations.
Chapter 2
Hierarchical Networks 


In a hierarchical network, the nodes are aggregated in groups (or clusters). Each
node knows a route to reach any node of its own group, but it does not store
all the routes required for reaching outside nodes. Routing update packets are
propagated as usual, but when they exit from a group they drop all its internal
information. This solution provides a marked saving in the routing table size and
in the overhead caused by routing updates.


2.1 Background and Related Works 


Kleinrock and Kamoun[6] were the first to study the theoretical properties of
hierarchical networks. They showed that the introduced stretch¹ is sufficiently
small for networks where the average distance grows as $N^{\nu}$, for a fixed $\nu > 0$,
where N is the number of nodes in the network. This is the case for wireless
networks in a two-dimensional space[8]: when the node density is constant, the
average distance is proportional to $N^{1/2}$.

¹ The stretch of a network measures the distortion caused by a non-shortest path routing
protocol and is defined as

$$\max_{x,y} \frac{d_R(x, y)}{d(x, y)}$$

where $d_R(x, y)$ is the length of the shortest path from x to y returned by the routing protocol,
while $d(x, y)$ is the length of the actual shortest path.


The Internet itself is a hierarchical network of two levels where clusters are
represented by Autonomous Systems and, as observed by Krioukov[7], almost all
proposals for a clean-slate design of a scalable Internet architecture are based,
sometimes implicitly, on the concept of hierarchical routing. In wireless networks
too, the only routing protocols that have been able to scale are based on
hierarchical concepts[9]. However, unlike wired networks, where the topology
is designed a priori, in a wireless network nodes may be added or removed over
time and, if the nodes are mobile, new links may be established or old ones may
break. For these reasons, the hierarchical topology must be dynamically
constructed and maintained. Moreover, since the routing addresses are
automatically assigned, a separation between the location and the identity of a node is
introduced. The location of a node is a label used only for routing purposes, while
the node's identity is the actual name that network users refer to when
contacting the node.


There are two main approaches for automatically configuring a hierarchical
topology: defining clusters as neighborhoods of a restricted subset of nodes, or creating
clusters as groups of nodes of bounded size. The first approach is adopted by
single-level or multi-level clustering protocols. In single-level protocols like LANMAR[10],
particular nodes called cluster heads are selected and their neighborhood of radius
r forms a cluster. Multi-level protocols[11],[12] go one step further by extending
the hierarchy recursively: among the level l cluster heads, some are elected as level
l + 1 cluster heads.


The second approach for constructing hierarchical topologies is adopted by DART[14]
and Netsukuku[16]. Using a graph-partitioning protocol, nodes are merged into
connected groups of bounded size S and, recursively, groups are merged into higher
level groups. Since the size of a group is bounded, it is possible to associate to
each node a routing address that requires a minimum amount of space, namely
O(log N) bits. In comparison, the addresses assigned by multi-level protocols
require $O(\log^2 N)$ bits. No virtual links are defined between the groups and distance
vector-like protocols are used for proactively discovering the routes.


In multi-level protocols the main difficulty arises in electing and maintaining
cluster heads. When a cluster head dies, all the members of the cluster have
to choose another cluster head as their leader. This is particularly severe when
the cluster head belongs to a high level, since all the nodes of that level have
to change their address and update their identity → location mapping. The
traffic generated by mapping handoffs has been estimated as the dominant overhead
factor in multi-level cluster networks[19].


In DART and Netsukuku the groups do not depend on the existence of a single
node. However, in some situations a group may become saturated and new nodes
will not be able to join the network. In [15], the authors pose the problem of
designing a mechanism for avoiding the saturation of groups, but they leave it as
an open problem, which we call the address space balancing problem. This is the
focus of our project. In essence, it is necessary to balance the clusters whenever
an upper bound is imposed on their size. For example, in MMWN[12] the authors
fix a preferred cluster size and for this reason they are forced to split a cluster in
two when its size becomes too big.


The hierarchical model that we will adopt assumes a multi-level topology with L
levels and groups of bounded size S. The only constraint imposed on the logical
topology is the connectivity constraint: all groups of all levels must be internally
connected. This model is strictly related to the DART/Netsukuku hierarchical
architecture, but it can be easily adapted to any other with connected groups of
bounded size.


We will now give a formal description of the hierarchical model. 


Definition 2.1.1. Let $G = (V, E)$ be a connected graph representing the network.
Suppose that the maximum number of nodes that can ever join the network is
$N_{max} = S^L$, that is $|V| \le N_{max}$, with $S, L$ positive integers and $S \ge 2$. Then we
can assign to each node x an address of the form

$$x_0.x_1 \dots x_{L-1}$$

where $0 \le x_i \le S-1$, $\forall i = 0, \dots, L-1$.


For routing purposes, we can allow situations where a node has more than one
address but we cannot assign the same address to different nodes. In other words,
a proper address assignment is a partial surjective function $a : S^L \to V$.


Define the following equivalence relation on V:

$$y \sim_l z \iff y_{\ge l} = z_{\ge l}$$

where $x_{\ge l} = x_l.x_{l+1} \dots x_{L-1}$,

that is, we are identifying nodes with the same l-suffix. We can represent the
equivalence class $[y]_l$ of y as follows:

$$[y]_l = *.y_l.y_{l+1} \dots y_{L-1}$$

Note that $[y]_0 = \{y\}$ and $[y]_L = V$.


For each level l and address x, we can then consider the subgraph of G induced
by the nodes of $[x]_l$, which is the subgraph of G formed by all the nodes that
have the same l-suffix as x. We will call $[x]_l$ the network at level l of x.
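As a minimal sketch of the addressing scheme above, the following Python fragment models an address as a tuple (x_0, ..., x_{L-1}) and the relation ~_l as equality of l-suffixes. The names `suffix`, `same_group` and `equivalence_class` are illustrative assumptions and do not come from the dissertation.

```python
from typing import Set, Tuple

# An address x_0.x_1...x_{L-1}; the most significant digit is the last one.
Address = Tuple[int, ...]


def suffix(x: Address, l: int) -> Address:
    """Return the l-suffix x_l.x_{l+1}...x_{L-1} of an address."""
    return x[l:]


def same_group(y: Address, z: Address, l: int) -> bool:
    """True when y ~_l z, i.e. y and z share the same l-suffix."""
    return suffix(y, l) == suffix(z, l)


def equivalence_class(y: Address, l: int, nodes: Set[Address]) -> Set[Address]:
    """[y]_l restricted to the addresses that are currently assigned."""
    return {z for z in nodes if same_group(y, z, l)}


# Example with S = 2 and L = 3 (hypothetical address assignment).
nodes = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)}
level1 = equivalence_class((0, 0, 0), 1, nodes)  # nodes with suffix (0, 0)
```

With l = 0 the class collapses to the node itself and with l = L it covers every assigned address, matching the two boundary cases noted in the text.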


By contracting the nodes of $[x]_l$ that have the same $(l-1)$-suffix we obtain the
graph called the group node (gnode) of x of level l:

$$g_l(x) = (V_l(x), E_l(x))$$
$$V_l(x) = [x]_l / \sim_{l-1} = \{ [y]_{l-1} \mid y \in [x]_l \} = \{ *.y_{l-1}.x_l \dots x_{L-1} \in V \mid 0 \le y_{l-1} \le S-1 \}$$
$$([y]_{l-1}, [z]_{l-1}) \in E_l(x) \iff \exists\, y' \in [y]_{l-1},\ z' \in [z]_{l-1} : y'z' \in E$$


In other words, we subdivide the network into groups and then we recursively
proceed to subdivide each group (see figure 2.1). An alternative representation of
the group nodes can be given using the language of trees: each $g_l(x)$ is a vertex
of a tree T and the elements of $g_l(x)$ are its children, or in other words, the nodes
with an address of the form $*.y_l \dots y_{L-1}$ are children of the node $*.y_{l+1} \dots y_{L-1}$.


We will continue to call an element $x \in g_l$ a node, while we will call single nodes
the elements of the original graph G.

A group node g is of level l if there exists a node x such that $g = g_l(x)$. We define
$\mathrm{lvl}(g) = l$. The graph formed by all the gnodes of level l is:

$$[G]_l = \bigcup_{x \in V} [x]_{l+1} / \sim_l$$


The links in $[G]_l$ are those induced by the single nodes, i.e. $g, h \in [G]_l$ are linked
if a single node $x \in g$ is linked to a single node $y \in h$. With $\Gamma_l(g)$ we indicate the
neighborhood of g in the graph $[G]_l$:

$$\Gamma_l(g) = \{ h \in [G]_l \mid g, h \text{ are linked} \}$$

Figure 2.1: A hierarchical topology represented as a tree and as nested groups.
The figure represents the two highest levels (L - 1, L - 2). The group size is [...]


It will sometimes be useful to consider a further level L by fixing, for all nodes x,
$x_L := \eta$, where $\eta$ is a constant called the network id. The group $g_L(x)$, called the network
group, is the same for all the nodes and contains all the group nodes of level L - 1.


The size of level $m \le l$ of a group $g_l$ is the number of m-level groups contained in
$g_l$:

$$\mathrm{size}_l(g_l) = 1$$
$$\mathrm{size}_m(g_l) = \sum_{y \in g_l} \mathrm{size}_m(y), \quad m < l$$

With $\mathrm{size}(g_l)$ we indicate $\mathrm{size}_0(g_l)$, i.e. the number of single nodes contained in $g_l$.
Since m-level nodes in $g_l(x)$ have addresses of the form $*.y_m.y_{m+1} \dots y_{l-1}.x_l \dots x_{L-1}$,
it follows that $\mathrm{size}_m(g_l) \le S^{l-m}$ and in particular $\mathrm{size}(g_l) \le S^l$.

We say that $g_l$ is full if $\mathrm{size}(g_l) = S^l$; $g_l$ is free if $\mathrm{size}(g_l) = 0$.
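The size function can be illustrated with a short Python sketch. It assumes addresses are tuples (x_0, ..., x_{L-1}) and that a gnode of level l is identified by its address suffix (x_l, ..., x_{L-1}); the helper names `size`, `is_full` and `is_free` are hypothetical. Counting distinct m-suffixes among the members of a gnode gives the number of non-empty m-level groups it contains.

```python
def size(nodes, gnode, m):
    """size_m(g_l): number of non-empty m-level groups inside the gnode
    whose address suffix is `gnode` = (x_l, ..., x_{L-1}); assumes m <= l."""
    members = [x for x in nodes if x[len(x) - len(gnode):] == gnode]
    return len({x[m:] for x in members})


def is_full(nodes, gnode, S, L):
    """A gnode of level l is full when it holds S**l single nodes."""
    l = L - len(gnode)
    return size(nodes, gnode, 0) == S ** l


def is_free(nodes, gnode):
    """A gnode is free when it holds no single node at all."""
    return size(nodes, gnode, 0) == 0


# Example with S = 2 and L = 3 (hypothetical address assignment).
nodes = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)}
singles_in_g2 = size(nodes, (0,), 0)   # single nodes inside *.0
groups_in_g2 = size(nodes, (0,), 1)    # level-1 gnodes inside *.0
```

In this example the level-2 gnode *.0 holds three of its four possible single nodes, so it is neither full nor free, while the level-1 gnode *.0.0 is full.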


If $h = *.y_l \dots y_{L-1}$, then all the gnodes $*.y_j \dots y_{L-1}$, with $j > l$, will be called the
higher gnodes of h. We define $\mathrm{up}(h) = *.y_{l+1} \dots y_{L-1}$. Analogously, we will talk
of lower gnodes of h and, by abuse of notation, we will sometimes write $y \in h$ to
indicate that y is any lower gnode of h.


A node $y \in h$ is called a border node of h if it is linked to at least one node $z \in h'$,
with $h' \ne h$ and $\mathrm{lvl}(h) = \mathrm{lvl}(h')$. For example, in figure 2.1, the node *.2.2 is a
border node of *.3. The set $\mathrm{bnode}(h, h')$ contains the border nodes in h that are
linked to h'. Notice that $g, h \in [G]_l$ are linked iff $\mathrm{bnode}(g, h) \ne \emptyset$.
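A possible way to compute border nodes is sketched below in Python, restricted to single nodes for brevity. Gnodes are identified by their address suffix and the physical links are given as pairs of full addresses; the function name `bnode` mirrors the notation of the text, while the data layout is an assumption made for this example.

```python
def bnode(edges, h, h2):
    """Single nodes of gnode `h` that are linked to a single node of gnode `h2`.

    `h` and `h2` are address suffixes of the same length (same level);
    `edges` is an iterable of undirected links between full addresses.
    """
    assert len(h) == len(h2) and h != h2
    k = len(h)

    def inside(x, g):
        # x belongs to g when its last k digits equal the suffix g
        return x[len(x) - k:] == g

    border = set()
    for a, b in edges:
        for u, v in ((a, b), (b, a)):   # links are undirected
            if inside(u, h) and inside(v, h2):
                border.add(u)
    return border


# Example with L = 2: a chain 0.0 - 1.0 - 0.1 - 1.1 (hypothetical topology).
edges = [((0, 0), (1, 0)), ((1, 0), (0, 1)), ((0, 1), (1, 1))]
border_of_0 = bnode(edges, (0,), (1,))   # border nodes of *.0 towards *.1
```

Two gnodes of the same level are linked exactly when this set is non-empty, which is the criterion stated at the end of the paragraph above.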


We will say that an address assignment forms a valid hierarchical topology if for 
all levels the graph of each group node is connected. For this reason, we will call 
this requirement the connectivity constraint for group nodes. 


We will now proceed to describe the main components required for a complete 
implementation of the above hierarchical network architecture. Later on we will 
focus on the problem of constructing a valid address assignment for creating a 
self-configuring network. 


2.2 Routing 


The main benefit of the connectivity constraint comes from the following
proposition:

Proposition 2.2.1. When the network is full, the routing table of each single node
contains at most $LS = S \log_S N$ entries.


Proof: Before proceeding, we give the following definition: let x, y be two nodes,
then

$$\mathrm{hdl}(x, y) = \min \{ 0 \le l \le L-1 \mid x_{\ge l+1} = y_{\ge l+1} \}$$

If $l = \mathrm{hdl}(x, y)$, then $g_{l+1}(x) = g_{l+1}(y)$ is the lowest gnode to which both x and y
belong. In the language of trees, $g_{l+1}$ is the nearest common ancestor of x and y.
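The definition of hdl translates directly into code. The following Python sketch assumes that addresses are tuples (x_0, ..., x_{L-1}) of equal length; it is an illustration of the definition rather than part of the original protocol description.

```python
def hdl(x, y):
    """Smallest level l in 0..L-1 such that x and y share the (l+1)-suffix."""
    L = len(x)
    assert len(y) == L
    for l in range(L):
        if x[l + 1:] == y[l + 1:]:
            return l
    # Not reachable: for l = L-1 both suffixes are empty and therefore equal.
    raise AssertionError("unreachable")


# Example with L = 3.
same_level1_gnode = hdl((0, 1, 2), (1, 1, 2))   # differ only in x_0
same_level2_gnode = hdl((0, 1, 2), (0, 0, 2))   # differ in x_1
only_network_group = hdl((0, 1, 2), (0, 1, 0))  # differ in x_2
```

The returned level l identifies g_{l+1} as the lowest gnode shared by the two nodes, i.e. their nearest common ancestor in the tree view.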


For all levels $l = 0, 1, \dots, L-1$, and for each group g of level l, run a distributed
route discovery algorithm on the graph of g, in such a way that at the end of the
discovery, each route starting from a gnode g' and contained in up(g') is known
and stored by all the single nodes of g'. Moreover, for each $l \ge 1$ and $h \in \Gamma_l(g_l(x))$, x
must also know at least one border node $b \in \mathrm{bnode}(g_l(x), h)$.

With the above protocol a packet can be correctly routed to any destination:
suppose x wants to forward a message to z. Let $l = \mathrm{hdl}(x, z)$. Both x and z
belong to $g = g_{l+1}(x) = g_{l+1}(z)$. By the connectivity constraint, g is connected
and there is a path $(g_l(x), y_l, \dots, g_l(z))$ in g connecting $g_l(x)$ to $g_l(z)$. Since the
routing protocol has explored all the gnodes of the network, and in particular g,
the route $(g_l(x), y_l, \dots, g_l(z))$ has been discovered and x knows it. Now, the problem
of routing the packet from x to z has been reduced to the problem of routing a
packet from x to any node of the group $y_l$ and then to z. To reach $y_l$, x will
forward the message to its known border node $b \in \mathrm{bnode}(g_l(x), y_l)$.
We now count how many entries a single node $x = x_0 \dots x_{L-1}$ stores in its routing
table. With the above routing protocol, x stores a routing entry for each $1 \le l \le L$
and $y \in *.x_l \dots x_{L-1}$. When the network is full, each gnode $*.x_l \dots x_{L-1}$ has S elements,
thus the total number of entries becomes

$$\left| \{ *.y_{l-1}.x_l \dots x_{L-1} \mid 1 \le l \le L,\ 0 \le y_{l-1} \le S-1 \} \right| = LS$$

The space required for storing the routing table is thus

$$S L^2 \log_2 S$$

bits. Note that in order to mark a node as a border node, x needs only one
additional bit.
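To give a feeling for the numbers, the two formulas of the proof can be evaluated for concrete parameters. The Python sketch below uses S = 16 and L = 4 purely as an illustrative choice; these values do not appear in the dissertation.

```python
import math


def routing_table_entries(S: int, L: int) -> int:
    """Upper bound LS on the entries stored by a single node."""
    return L * S


def routing_table_bits(S: int, L: int) -> float:
    """Space S * L^2 * log2(S): LS entries, each an address of L * log2(S) bits."""
    return S * L ** 2 * math.log2(S)


S, L = 16, 4                 # illustrative parameters
N = S ** L                   # 65536 addresses in a full network
entries = routing_table_entries(S, L)
bits = routing_table_bits(S, L)
```

For these parameters a flat routing table would need one entry per node, that is 65536 entries, whereas the hierarchical table holds 64 entries occupying 1024 bits.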


We now give some remarks on how to implement such a routing protocol.

Any Distance Vector or Link-State routing protocol can be converted into a
hierarchical version as follows: when a node $x = x_0 \dots x_{L-1}$ receives a route $(x, y, z)$,
where y is a neighbor of x and z is the destination, x will install the following
entry in its routing table:

gateway = y, destination = $*.z_l.z_{l+1} \dots z_{L-1}$

where $l = \mathrm{hdl}(x, z) = \min \{ 0 \le l' \le L-1 \mid x_{\ge l'+1} = z_{\ge l'+1} \}$
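The aggregation rule above can be sketched as follows. The Python fragment assumes tuple addresses and a routing table modelled as a dictionary from destination suffix to gateway; `install_route` is a hypothetical helper, and a real protocol would compare route metrics instead of simply overwriting an entry.

```python
def hdl(x, z):
    """Smallest level l such that x and z share the (l+1)-suffix."""
    for l in range(len(x)):
        if x[l + 1:] == z[l + 1:]:
            return l


def install_route(table, x, gateway, z):
    """Store the route (x, gateway, z) under the aggregated destination
    *.z_l...z_{L-1}, where l = hdl(x, z)."""
    l = hdl(x, z)
    destination = z[l:]
    table[destination] = gateway   # simplification: the latest route wins
    return destination


# Node 0.1.2 learns three routes through its neighbor 1.1.2 (L = 3).
x, y = (0, 1, 2), (1, 1, 2)
table = {}
d1 = install_route(table, x, y, (0, 0, 2))   # different level-1 gnode
d2 = install_route(table, x, y, (2, 0, 2))   # same remote gnode as d1
d3 = install_route(table, x, y, (1, 1, 2))   # node in x's own level-1 gnode
```

The first two destinations collapse into the single entry *.0.2, which is how the hierarchy keeps the table small: remote nodes are remembered only as members of a gnode.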


Distance Vector routing protocols do not require any further modification. In- 
stead, Link State protocols are more complicated to implement, as they require an 
appropriate definition for the weight of the virtual link that connects two groups. 


With a hierarchical topology it is also possible to prevent loops of flooding packets
in a simple way: each time a node forwards a route discovery packet, it appends
its address at the end of the packet. A node will discard a packet if it finds its
address in the appended list. There is no risk that the list will become too large:
when the packet exits from a group node, all its internal addresses are discarded,
i.e. when the packet contains a list of the form

$$*.x^1_l.x_{l+1} \dots x_{L-1},$$
$$*.x^2_l.x_{l+1} \dots x_{L-1},$$
$$\vdots$$
$$*.x^m_l.x_{l+1} \dots x_{L-1},$$
$$*.y_l.y_{l+1}.x_{l+2} \dots x_{L-1}$$

it is rewritten to

$$*.x_{l+1} \dots x_{L-1},$$
$$*.y_l.y_{l+1}.x_{l+2} \dots x_{L-1}$$

This means that once a packet exits from a group node, it will not return inside.
A gnode can have a maximum of S nodes, thus its diameter is also bounded by S.
It follows that in the worst case the list appended to a packet will contain (S-1)L
entries.
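The rewriting of the appended list can be sketched in Python as follows. The fragment is one possible reading of the rule above, not the author's implementation: each list entry is an address suffix, `compress` shortens an entry to the highest gnode that does not contain the forwarding node, and `already_visited` treats a collapsed gnode as visited for every node inside it. All three function names are assumptions made for this example.

```python
def compress(entry, me):
    """Shorten `entry` to the highest gnode that does not contain `me`."""
    L = len(me)
    for drop in range(len(entry)):
        cand = entry[drop:]
        # Keep cand when the gnode directly above it also contains `me`.
        if cand[1:] == me[L - len(cand) + 1:]:
            return cand
    return entry   # only reached for an empty entry


def append_hop(path, me):
    """Rewrite the list from the point of view of `me`, then append `me`."""
    new_path = []
    for entry in path:
        c = compress(entry, me)
        if not new_path or new_path[-1] != c:   # merge collapsed duplicates
            new_path.append(c)
    new_path.append(me)
    return new_path


def already_visited(path, me):
    """True if `me`, or a gnode containing `me`, is already in the list."""
    return any(e == me[len(me) - len(e):] for e in path)


# A packet crosses 0.0.0 and 1.0.0 (both in *.0.0), then enters *.1.0.
path = []
for hop in [(0, 0, 0), (1, 0, 0), (0, 1, 0)]:
    path = append_hop(path, hop)
```

After the third hop the two addresses internal to *.0.0 have been replaced by the single entry *.0.0, so any node of that gnode receiving the packet again would discard it.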


2.3 Hierarchical Distributed Hash Table 


The addresses of the hierarchical topology are used to encode connectivity
information and are thus not arbitrary. For this reason, a separate mechanism is
needed in order to give an identity to nodes.

An easy way to solve the problem is to set up a classic Domain Name System
where a few single nodes function as DNS servers. A much better way is instead
to exploit the hierarchical topology and build a Hierarchical Distributed Hash Table
(HDHT) that will store the associations between names and addresses.


We now describe how to construct an HDHT. Let $S^L$ be the address space, $V_t$ the set
of nodes of the network at time t and $a_t : S^L \to V_t$ the address assignment at time
t. Let K be the key space, which we can assume to be larger than the address space
($S^L \subseteq K$). The aim of a DHT is to maintain at each time t a function $d_t : K \to D$,
where D is the data space, e.g. strings of a few bytes. The function $d_t$ is distributed
among the nodes of the network: each node stores in its memory a subset of $d_t$, i.e.
a small set of mappings $\{k_1 \mapsto d_t(k_1), k_2 \mapsto d_t(k_2), \dots, k_m \mapsto d_t(k_m)\}$. Also, $d_t$
varies through time: a node might request to change the mapping $k \mapsto d(k)$ to
$k \mapsto d'$. The basic idea for building the HDHT is to let the node $a_t(h(k))$ store
the mapping $k \mapsto d(k)$, where $h : K \to S^L$ is a hash function. Thus, in order to
retrieve such a mapping or to change it, the nodes will contact the node $a_t(h(k))$.

However, if the network is not full ($|V_t| < S^L$), $a_t$ might be a partial function not
fully defined on $S^L$, i.e. some addresses might not be assigned to any node. This
matter is solved with another dynamic function $H_t : S^L \to \mathrm{dom}(a_t)$. Given an
address x, $H_t$ returns another address $H_t(x)$ that has already been assigned to a
node. $H_t$ can be implemented in a distributed way as follows:


1. define $H_t(x)$ as the nearest address to x associated to an alive node:

$$H_t(x) = \operatorname{argmin}_{x' \in \mathrm{dom}(a_t)} \mathrm{abs}(x - x')$$

(note 2) x, x' are considered as vectors and are compared using the
lexicographic order where the most significant digit is the last one ($x_{L-1}$). abs is
defined component-wise.


2. a node does not need to have complete knowledge of $\mathrm{dom}(a_t)$, i.e. it does
not need to know whether, for a given address, there is a corresponding alive node in the
network. A node $y = y_0 \dots y_{L-1}$, by looking at its routing table constructed
with the routing protocol described above, knows which nodes of $g_l(y)$ are
alive, $\forall l = 1, \dots, L$ (note 3). If $y'_{l-1} \in g_l(y)$ is an alive node, then there is an
alive single node with an address of the form $*.y'_{l-1}.y_l \dots y_{L-1}$. Let $\mathrm{dom}_y(a_t)$ be the set
of all the alive nodes known by y.


When a packet has to be sent to $H_t(x)$, it will be routed to $H_t(x)$ using a
greedy algorithm: when y receives the packet, it forwards it to the group
node with address

$$H^y_t(x) = \operatorname{argmin}_{x' \in \mathrm{dom}_y(a_t)} \mathrm{abs}(x - x')$$

In sum, the node associated to a key k is $a_t(H_t(h(k)))$.
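The nearest-address function can be sketched in Python under a few stated assumptions: addresses are tuples (x_0, ..., x_{L-1}), the component-wise distance is compared lexicographically starting from the last digit, and ties are broken by taking the smaller address under the same ordering (the text only says that the minimum of the two candidates is picked, so the ordering used for the tie-break is an assumption). The name `nearest_alive` is hypothetical.

```python
def nearest_alive(x, alive):
    """Sketch of H_t(x): the alive address closest to x.

    `alive` plays the role of dom(a_t); passing the subset known by a
    node y instead gives the greedy forwarding target H_t^y(x).
    """
    def msd_first(c):
        # Reorder digits so that the most significant one (the last) comes first.
        return tuple(reversed(c))

    def distance(c):
        # Component-wise |x - c|, most significant digit first.
        return tuple(abs(a - b) for a, b in zip(msd_first(x), msd_first(c)))

    # Tie-break (assumption): the smaller address in the same ordering.
    return min(alive, key=lambda c: (distance(c), msd_first(c)))


# Example with S = 4 and L = 2 (hypothetical set of assigned addresses).
alive = {(0, 0), (3, 0), (1, 2)}
target = nearest_alive((2, 1), alive)
```

Here 3.0 and 1.2 are equally distant from 2.1, and the tie-break selects 3.0 because its most significant digit is smaller. When x itself is alive the function returns x.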


Notice that an HDHT is more efficient than a classical DHT like CHORD[17]. In
fact, in classical DHTs, a request is forwarded multiple times between nodes. For
instance, CHORD requires O(log N) forwardings. Instead, an HDHT is built on top of the
routing infrastructure of the network. Each read/write request is directly routed
to the correct node, so that the number of required forwardings is 1.


The latency for querying and updating a mapping can be further optimized by
extending the HDHT: suppose the single node z wants to read or update a mapping
$k \mapsto d$, where k is the key and d is the data. Instead of directly querying the node
h(k), z will do the following:

1. let $h(k) = h_0 \dots h_{L-1}$ and $z = z_0 \dots z_{L-1}$.
2. z will query in order:

$$h_0.z_1 \dots z_{L-1}$$
$$h_0.h_1.z_2 \dots z_{L-1}$$
$$\vdots$$
$$h_0.h_1 \dots h_{L-1} = h(k)$$

² There can be two addresses x', x'' that minimize abs(x - x'); in this case we pick min{x', x''}.
³ $y_L = \eta$ is the network group.


With the above scheme, the HDHT is sliced into levels: a node will initially query
nodes in its own group of level 1, then nodes in its own group of level 2 and so
on, until it finds a result. Each time it goes up one level, the destination node
may be located farther away in terms of routing distance; conversely,
finding an answer in lower levels may be more profitable.
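The level-by-level query order can be generated mechanically. The Python sketch below assumes tuple addresses and a hypothetical helper name `query_order`; in a network that is not full, each produced address would still have to be mapped to an alive node through H_t.

```python
def query_order(h, z):
    """Addresses queried by node z for a key hashed to h, lowest level first.

    The i-th address takes its first i digits from h and the rest from z,
    so it lies in the same level-i gnode as z.
    """
    assert len(h) == len(z)
    L = len(h)
    return [h[:i] + z[i:] for i in range(1, L + 1)]


# Example with L = 3: h(k) = 5.6.7 and z = 1.2.3.
order = query_order((5, 6, 7), (1, 2, 3))
```

The last address of the sequence is always h(k) itself, so the scheme falls back to the plain HDHT lookup when no lower level holds the mapping.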
Chapter 3
Balancing the Address Space


In this chapter, we will discuss how to construct and maintain a proper address
assignment that structures the network in a hierarchical topology.


3.1 Related Problems and Works 


The problem of constructing a proper address assignment from scratch is not
easy; in fact, it is strictly related to the Bounded-Connected-Graph-Partitioning
(BCGP(G, M, k)) problem: given a graph G = (V, E) and two integers M, k > 0,
decide if there is a partition of its vertices $V = V_1 \cup \dots \cup V_m$ such that

1. $m \le M$
2. each component $V_i$ is connected
3. $|V_i| \le k$, $\forall i = 1, \dots, m$

When k is fixed to 4, the above problem is called Bounded Component Spanning
Forest (BCSF) and it is known to be NP-Complete[18].
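Checking whether a given partition satisfies the three conditions is straightforward; the difficulty of BCGP lies in finding such a partition. The Python sketch below is only a verifier for a candidate solution, with the graph given as an adjacency dictionary; the names `is_connected` and `is_bcgp_solution` are illustrative.

```python
from collections import deque


def is_connected(part, adj):
    """True if the vertices in `part` induce a connected subgraph."""
    part = set(part)
    if not part:
        return False
    start = next(iter(part))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in part and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == part


def is_bcgp_solution(adj, parts, M, k):
    """Check conditions 1-3 for a candidate partition of the vertices of adj."""
    covered = sorted(v for p in parts for v in p)
    is_partition = covered == sorted(adj)          # disjoint and covering V
    return (is_partition
            and len(parts) <= M                                      # condition 1
            and all(is_connected(p, adj) for p in parts)             # condition 2
            and all(len(p) <= k for p in parts))                     # condition 3


# Path graph 0 - 1 - 2 - 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ok = is_bcgp_solution(adj, [[0, 1], [2, 3]], M=2, k=2)
```

On this path graph the partition {0, 1}, {2, 3} is accepted for M = 2 and k = 2, while {0, 2}, {1, 3} is rejected because its components are not connected.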
Proposition 3.1.1. Given a graph G = (V, E) and S > 0, the problem of deciding
if there is a proper address assignment $a : S^L \to V$ such that

1. $L = \min \{ L \ge 1 \mid |V| \le S^L \}$
2. each node has no more than one address
3. the number of gnodes of level 1 is no more than M

is NP-hard.


Proof: With an assignment as above and S = k, the nodes are partitioned in
connected groups of level 1 and $|g| \le S = k$, $\forall g \in [G]_1$. Thus, it is possible to
decide if BCGP(G, M, k) is true or not.


Different solutions and heuristics have been proposed for constructing a solution
to the BCGP problem or one of its variations. The solution presented in [21]
selects a spanning tree rooted at a random vertex. By traversing the tree from
the leaves, vertices are aggregated in connected components. The authors show
that in the case of Random Geometric Graphs their algorithm can achieve a small
number of common vertices between two components. The distributed version of
the algorithm works by creating the tree with flooding. Finally, their algorithm
is reset based: there are some cases where it is necessary to rebuild the clustering
from scratch.

By requiring that the size of all components is almost the same
($0 \le \big| |V_i| - |V_j| \big| \le 1\ \forall i, j$), the BCGP becomes closely related to the problem of Graph Partitioning,
which has been extensively studied for applications such as VLSI circuit layout,
image processing and matrix computations[22]. For generating an initial partition,
graph partitioning algorithms generally resort to either spanning tree techniques
or to graph growing. Graph growing algorithms initially form groups of size one
by selecting a random subset of vertices (seeds); afterwards, neighboring vertices
are iteratively added, enlarging the groups. If a group becomes too large, the
procedure is recursively applied to the group.


Unlike the above works, in this project we are interested in an incremental,
distributed solution to the address assignment problem: as the network evolves the
address assignment must be updated and the nodes must be able to change their
address without having a global knowledge of the network. Also, as explained in
Chapter 5, the update has to minimize the number of address changes.
3.2 Dynamic Balance


In a distributed implementation of a self-configuring network, the nodes have to
choose their own addresses. For this reason, from now on, we will view the task of
finding a proper address assignment as an evolving distributed process. We will
say that a node joins a group node $g_l(y)$ when it chooses an address x such that
$x \sim_l y$. Analogously, a node can leave a group and can migrate from one group to
another. Further, we say that x creates or allocates a gnode $g_l$ if x joins $g_l$ and it
is its unique node, i.e. $g_l = \{x\}$.


The connectivity requirement for having a valid hierarchical topology is a strong 
constraint and it is the cause of the gnode split and of the address space balancing 
problems. 


The first problem arises while trying to maintain a proper address assignment.
The removal of a node or a link may disconnect a gnode $g_l$ into multiple connected
components $g_l = A_1 \cup \dots \cup A_m$ (gnode split). When this happens, the connectivity
constraint is no longer satisfied. See figure 3.1 for an example.



Figure 3.1: The removal of a link disconnects the group node A into two connected
components. The nodes of one component will have to change their membership
by migrating into another gnode B.


Notice that a split of a group $*.g_l \dots g_{L-1}$ may induce a split of one of its higher
gnodes $*.g_{l'} \dots g_{L-1}$, with $l' > l$. This happens when $g_i$ is an articulation point¹
in $G_{i+1}$, $\forall i = l, \dots, l'-1$.


¹A vertex $x$ of a connected graph is an articulation point if there are two distinct nodes such
that all the paths that connect them pass through $x$.

The only solution to the gnode split problem is to promptly repair the address
assignment: the single nodes of the components $A_1, \dots, A_m$ are forced to change
their address and to migrate into other groups.


The address space balancing problem arises when a new node joins the network: 

Proposition 3.2.1. As a consequence of the connectivity constraint, there are
some configurations where a new node cannot join a gnode $g_l$, even if $\mathrm{size}_0(g_l)$
is not full. In this case we say that $g_l$ is saturated.

Moreover, the address space of the whole network can be saturated with just
$(S-1)L+1$ nodes.


Proof: Let $g_2 = *.y_2 \dots y_{L-1}$ and suppose that $\mathrm{size}_1(g_2)$ is full, i.e. $g_2$ contains
all the possible gnodes of level 1, or in other words, $*.y'.y_2 \dots y_{L-1}$ has at least
one node, $\forall\, 0 \le y' \le S-1$. Suppose further that a gnode $h = *.y_1.y_2 \dots y_{L-1}$ is
full. Finally, suppose that a node $x$ is linked to nodes of $h$ only. Since $h$ is full,
in order to preserve the uniqueness of addresses, $x$ cannot join $h$. Moreover, by
the connectivity constraint, $x$ will not be able to choose any other address of $g_2$,
i.e. it will not be able to join any $*.y.y_2 \dots y_{L-1}$, with $0 \le y \le S-1$.

The whole network $g_L$ can be saturated as described above; however, we need far
fewer than $|\mathrm{size}_1(g_L)| = S^{L-1}$ nodes. In fact, we can saturate it as follows:

1. First, turn on only $S-1$ nodes and assign to each of them an address of the
form $*.y$, with $1 \le y \le S-1$. Ensure also that they are connected.

2. Consider another $S-1$ nodes and ensure that they form a connected graph. Let
them join the network, using an address of the form $*.y.0$, with $1 \le y \le S-1$.

3. Continue recursively: $*.y.0.0$, $*.y.0.0.0$, ..., until adding the nodes $y.0.0 \dots 0$.

4. Finally, add the node $0.0 \dots 0$.

In this network, a node $x$ that is only linked to nodes of the gnode $*.0.0 \dots 0$ will
not be able to join.
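
The construction in the proof can be checked on small parameters. The following Python sketch is only an illustration under stated assumptions: addresses are modelled as tuples with the top-level digit first (the text writes the top level on the right), the digits hidden by the `*` wildcard are arbitrarily set to 0, and the function names are hypothetical. Connectivity between the nodes is not modelled; the sketch only counts the addresses and verifies that no gnode along the branch $0.0 \dots 0$ is left free.

```python
def saturating_addresses(S, L):
    """Addresses (top-level digit first) of the nodes used in steps 1-4 of the proof."""
    addrs = []
    for depth in range(L):  # depth 0 is the top level L-1, depth L-1 is level 0
        for y in range(1, S):
            # zeros above this level, y at this level, zeros (arbitrary choice) below
            addrs.append((0,) * depth + (y,) + (0,) * (L - depth - 1))
    addrs.append((0,) * L)  # step 4: the node 0.0...0
    return addrs


def no_free_gnode_on_zero_branch(addrs, S, L):
    """True if, at every level, each sibling of the branch 0.0...0 holds at least one node."""
    for depth in range(L):
        present = {a[depth] for a in addrs if a[:depth] == (0,) * depth}
        if present != set(range(S)):
            return False
    return True
```

For example, `saturating_addresses(4, 3)` yields $(4-1) \cdot 3 + 1 = 10$ distinct addresses, far fewer than the $4^3 = 64$ addresses available.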


In order to avoid network saturation, we need to balance the address space: the 
address assignment has to be updated over time in order to let any new node 
acquire a proper address if the network is not full. 


The requirement of having a valid address assignment restricts the choice of how 
a balancing protocol reconfigures the network. 

Proposition 3.2.2. Consider the situation described in Proposition 3.2.1, where 
there are no more free gnodes left and a node x is forced to join to a full group 
h. In this case, if there is a proper address assignment, then we have only two 
solutions: 


1. either a node migrates from $h$


2. or a gnode $g$ is emptied

With solution 1, $x$ is then able to join $h$; with solution 2, $x$ can re-create
the gnode $g$.


Let $M_2$ be the minimum number of migrations required for applying solution 2,
and $M_1$ that required for applying only solution 1. Then we have:


1. $M_1 = \min \{ \mathrm{length}(P) - 1 \mid P \text{ is a migration path starting from } h \}$,
where a migration path is defined in the proof below
2. $M_2 \ge \min \{ |g| \mid g \text{ gnode} \}$


Proof: Consider any new proper assignment and let $y_1, \dots, y_S$ be the nodes in
the group $h$ of the old assignment. Let $g(y_i) = \mathrm{up}(y_i)$. In the new assignment we
have two cases: either $g(y_i) = g(y_j)\ \forall i,j$, or not.
In the former case, $x$ has necessarily joined a gnode $g$ different from $h$. However,
since $x$ is only linked to nodes of $h$, by the connectivity constraint $x$ is the only
node of $g$. In other words, $x$ has created $g$. By hypothesis, none of the groups was
free, thus the old nodes of $g$ have migrated, i.e. $g$ has been emptied and $x$ has
re-created it.
In the latter case, at least one node has migrated from $h$.
The number of migrations required in the former case is at least $|g|$. It can be
larger, for example if the migrations from $g$ force other migrations. Thus, we have

$M_2 \ge \min \{ |g| \mid g \text{ gnode} \}$


Call a path $P$ starting from $h$ a migration path if
$P = (p_1, \dots, p_m)$, $p_1 = h$,
$\forall i \le m-1$: $p_i$ is full, $p_i$ is a gnode linked to $p_{i+1}$, $\mathrm{lvl}(p_i) = \mathrm{lvl}(h)$,
$p_m$ is not full, $\mathrm{lvl}(p_m) \ge \mathrm{lvl}(h)$

If solution 2 is forbidden, we have no other choice than to repeatedly
apply solution 1. That is, a migration path is selected and at least one node migrates
from $p_i$ to $p_{i+1}$, for all $i \le m-1$. As we will see in Prop. 3.2.11, we can force the
migration of exactly one node from $p_i$ to $p_{i+1}$. Thus, the number of migrations
required is $\mathrm{length}(P) - 1$ and it is minimized by the shortest migration path.
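
As an illustration of how a shortest migration path could be searched, here is a minimal Python sketch. It assumes a centralized view of a single level: `adj` maps each gnode to its linked gnodes, `size` gives the current number of members and `S` is the capacity. These names are hypothetical, and the case $\mathrm{lvl}(p_m) > \mathrm{lvl}(h)$ is ignored.

```python
from collections import deque


def shortest_migration_path(adj, size, S, h):
    """BFS from h through full gnodes up to the nearest non-full gnode."""
    parent = {h: None}
    queue = deque([h])
    while queue:
        p = queue.popleft()
        if size[p] < S:  # p plays the role of p_m: the first non-full gnode reached
            path = []
            while p is not None:
                path.append(p)
                p = parent[p]
            return path[::-1]
        for q in adj[p]:  # only full gnodes are expanded
            if q not in parent:
                parent[q] = p
                queue.append(q)
    return None  # every reachable gnode is full
```

The number of migrations needed along the returned path is `len(path) - 1`, one node per hop.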


Remark 3.2.3. Between the two solutions presented in Prop. 3.2.2, we prefer to 
adopt the first, for two main reasons: 


1. The second solution forces all the nodes of a gnode g to migrate. This 
implies that some other gnodes will increase their size. As a consequence, 
the network may reach the saturation point quicker. 


2. In a distributed implementation where a group does not know the size of
the other groups, the first solution requires less communication overhead: in
order to find a shortest migration path, the gnode $h$ queries its surrounding
gnodes using a BFS²-like exploration, which is stopped as soon as a shortest
path is found. Instead, in the second solution, at least the size of all group
nodes has to be discovered³.

Remark 3.2.4. In Proposition 3.2.2, we described the necessary solutions
for fixing a saturated network. However, instead of fixing the network at the last
minute, we might try to avoid filling up a group, unless it is strictly necessary,
and try to always keep the network saturation-free. Also in this case, we do not
have much choice on how to reach a new address assignment. In fact, we cannot
predict which group the new nodes will join, i.e. we must assume that any group
can increase its size. Thus at some point, we must decide if the size of a group
is too large, and force at least one node of the group to migrate. This means
that we have to use a condition $p(|g|)$ that depends on the size of the group $g$,
and possibly on other parameters. When $p(|g|)$ is true, a border node of $g$ will be
forced to migrate. The simplest condition is obtained by fixing


$p(|g|) = (|g| > S)$


In this case, a migration will occur only when g is full and a new node x joins. If 
x joins to g only when it is forced to do so, then this becomes the same solution 
1. of Prop. 3.2.2. Instead, by fixing 


$p(|g|) = (\exists h : |g| \ge |h| + 2)$


a migration will occur only if $g$ is bigger than one other gnode⁴. This condition
is the opposite of the previous one: a node will migrate as soon as possible.


We will later see in more detail the above two balancing rules.
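
The role of the gap 2 in the second condition can be seen with a toy simulation. The sketch below is an assumption-laden illustration, not part of the protocol: it models only two linked gnodes, identified by their sizes, and lets one node leave the larger gnode whenever the size difference is at least `gap`.

```python
def simulate(sizes, gap, max_steps=10):
    """Return (number of migrations, final sizes) for two linked gnodes."""
    g, h = sizes
    steps = 0
    while steps < max_steps:
        if g >= h + gap:
            g, h = g - 1, h + 1  # one node migrates from g to h
        elif h >= g + gap:
            g, h = g + 1, h - 1  # one node migrates from h to g
        else:
            break
        steps += 1
    return steps, (g, h)
```

With `gap=1` the pair `(3, 2)` keeps swapping a node until `max_steps` is exhausted, while with `gap=2` no migration happens, and `(4, 2)` settles at `(3, 3)` after a single migration.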


Notice that, in any case, the migration of a node from g can make p(|g|) false, but 
p(|h|) true, for some other gnode h. Thus, in order to avoid infinite back and forth 
migrations from g to h and from h to g again, migration paths become a necessity. 


²Breadth First Search

³We say “at least” because finding the group $g$ that minimizes the number of migrations is
not just a matter of knowing its size. In fact, in Prop. 3.2.2, $\min \{ |g| \mid g \text{ gnode} \}$ is a lower
bound of $M_2$.

⁴If $\mathrm{size}(g) \ge |h| + 1$ is used instead, a loop can occur: a node may endlessly migrate back and
forth from $g$ to $h$.

They are redefined as follows:


$P$ is a migration path $\Leftrightarrow$ $P = (p_1, \dots, p_m)$, $p_1 = h$, $p_i$ is a gnode linked to $p_{i+1}$,
$\forall i \le m-1$: $p(|p_i|)$ is true,
$p(|p_m|)$ is not true


Proposition 3.2.5. Suppose that all the lower gnodes of $h$ are allocated; then it
is possible to change the address of a node $x \in h$ only if $x$ is a border node of $h$.


Proof: In fact, by the connectivity constraint, a node $y \in h$ linked only to nodes
of $h$ is forced to remain in $h$. Thus the only nodes that can migrate are the border
nodes. The converse does not always hold, because even if $y$ is a border node, if
all its neighboring gnodes are full, then it cannot join them.


The migration of a border node $b$ may affect the topology of the higher levels. For
example, suppose that $b$ is the unique border node in $\mathrm{bnode}(g,h)$. If $b$ migrates
to $f \ne h$, then $g$ will become linked to $f$ but will lose its link with $h$. In general,
the following proposition holds:

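Proposition 3.2.6. Let $g, h \in [G]_l$. If a border node $b$ migrates from $g$ to $h$ and
$|g| \ge 2$, the resulting gnodes $g', h'$ satisfy:

$|\Gamma_l(g')| \ge 1, \quad |\Gamma_l(h')| \ge |\Gamma_l(h)|, \quad |\Gamma_l(g')| \le |\Gamma_l(g)|$

Also, $g$ may lose one of its links and $\mathrm{up}(g)$ may become split, but $[G]_l$ remains
connected.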


Proof: Since $|g| \ge 2$ and $b \in g$, the connectivity constraint implies that $b$ is
linked to at least one other node $x \in g$. Thus when $b$ migrates, $x$ will become a
border node and $|\Gamma_l(g')| \ge 1$.

Since the border node $b$ is a new node in $h'$, it follows that $|\Gamma_l(h')| \ge |\Gamma_l(h)|$.
Equality holds when $h$ was already linked to all the neighboring gnodes of $b$.
Finally, if $b$ was the unique border node in $\mathrm{bnode}(g,h)$, then $g'$ is no longer linked
to $h'$. Thus, $|\Gamma_l(g')| \le |\Gamma_l(g)|$.

Let $\Gamma_l(g) = \{h_1, \dots, h_m\}$, with $h = h_1$. $[G]_l$ remains connected because a broken
link $(g, h_i)$ can be replaced by the path $(g, h_1, h_i)$. However, if $h_1 \notin \mathrm{up}(g)$, then
$(g, h_1, h_i)$ is not a path contained in $\mathrm{up}(g)$. If this was the only path connecting $g$
to $h_i$, then $\mathrm{up}(g)$ becomes split.


We have a result similar to the previous proposition in the case of gnode migration:
Proposition 3.2.7. Suppose that a gnode $g$ of level $l$ migrates; then the graph
$[G]_{l-1}$ does not change, and the new $[G]_l$ is isomorphic to the old one.




Proof: This follows directly from how the gnodes migrate. Let $g = *.g_l \dots g_{L-1}$.
When $g$ migrates, it will assume another address of the form $*.g'_l \dots g'_{L-1}$, thus
a node of level $l-1$ will change its address from $x = *.g_{l-1}.g_l \dots g_{L-1}$ to
$x' = *.g_{l-1}.g'_l \dots g'_{L-1}$. If a single node was a member of $x$, it will still be a member of
$x'$. In other words, the “inside” of $g_l$ has not changed. Thus, the links between
$*.g_{l-1}$ and the others $*.h_{l-1} \in [G]_{l-1}$ are still the same. (What could have changed are
the links between $*.g_{l+1}$ and another $*.h_{l+1}$.)

Finally, $[G]_l$ is isomorphic to $[G']_l$ because their only difference is the name of the
gnode $*.g_l$, which has been changed to $*.g'_l$.


It is not always possible to solve the address balancing problem, i.e. in some cases, 
some nodes of the network will not be able to join: 
Example 3.2.8. Not all graphs admit a proper address assignment. 


Proof: Consider the string topology formed by $N = S^L > 1$ nodes, that is, if
the nodes are $v_1, \dots, v_N$, then $v_i v_{i+1}$, $\forall i = 1, \dots, N-1$, are all the edges. Now,
attach a dangling node $q$ to $v_S$, i.e. $v_S q$ is a link. Then this network does not have
a proper address assignment. In fact, suppose by contradiction that it has one.
First observe that since the network is full ($N = S^L$), all the gnodes of any level
are full too, i.e. all the addresses have been used and

$|g| = S \quad \forall g \text{ gnode}$   (1)


This means that $v_S \in g$ for some gnode $g$ of level 1. Let $j$ be such that

$j = \min \{ i \mid 1 \le i \le S,\ v_i \in g \}$

Consider the case where $j > 1$. We have

$v_1, \dots, v_{j-1} \notin g$   (2)

Let $H \subseteq \{v_1, \dots, v_{j-1}\}$ be a maximal subset such that $\forall x, y \in H: g_1(x) = g_1(y)$, that
is, all the elements in $H$ are in the same level 1 gnode and all the other $v_i \notin H$ are
not. Since $|H| \le j-1 < S$, by (1) it follows that $H$ cannot be a complete gnode,
that is

$\exists w \notin H : g_1(w) = g_1(x)\ \forall x \in H$

$H$ is by definition maximal $\Rightarrow w \notin \{v_1, \dots, v_{j-1}\} \Rightarrow w = v_{j+h},\ h \ge 0$

It is not possible that $w = v_j$, otherwise $g_1(H) = g_1(v_j) = g$ and this is in contrast
with (2)

$\Rightarrow w = v_{j+h},\ h > 0$

So we have found a $w$ which is in the same group as the elements in $H$, but is
not linked to any of them. This contradicts the connectivity constraint.

Consider now the case where $j = 1$. Since $|g| = S$, we have

$g = \{v_1, \dots, v_S\}$   (3)

Now recall that the dangling node $q$ is linked only to $v_S$, thus by the connectivity
constraint, the only gnode it can belong to is $g = g_1(v_S)$. But this
contradicts (3).

In any case, we have shown a contradiction. Therefore no proper address assignment
is possible.


Example 3.2.9. Reaching a valid assignment is sometimes impossible through
local address changes.


Proof: Consider the case where a group $g$ has only one border node $b$ with a
group $h$. Suppose also that $b$ is forced to migrate to $h$. If $b$ is an articulation point
of $g$, then its migration will disconnect $g$. Suppose $h$ is full; then one component
of $g$ might not be able to follow $b$ and migrate to $h$, and it will remain completely
disconnected from the network.


Definition 3.2.10. Clearly, the problems exposed in the above two examples can 
be solved at the cost of increasing the parameter S. Another workaround is to add 
new links: suppose a migrating node b splits a group g, then virtual links will be 
established between the old neighbors of b belonging to g. In this way, g remains 
connected. A virtual link between x,y € g is removed when x or y leaves g, or 
when a new link reconnects g. 


Creating virtual links in $g$ can still be seen as changing the parameter $S$, but only
locally to $g$: when the node $b$ migrates from $g$ to $h$, it assumes two addresses and
belongs at the same time to $g$ and $h$. In $g$, $b$ is seen as a virtual node, with a
non-standard address of the form $b' = *.(S + k).g_{l+1} \dots g_{L-1}$. The virtual node $b'$ does
not act as a border node, i.e. it does not maintain links with nodes outside $g$,
and when its neighbors leave $g$ or $g$ becomes reconnected, it is removed from the
network.


With the use of virtual links, the following proposition holds. 
Proposition 3.2.11. If one node migrates from a group, the group will not be 
disconnected and no other migrations will occur. 


Although there are pathological cases where an arbitrary number of virtual nodes
are added, we will later see in simulations that their use is rarely needed for random
networks with DL = 1.


3.3. Last Minute and Preemptive Balancing 


The two balancing rules presented in Remark 3.2.4 adopt two different strategies, 


1. Last-Minute: the addresses of some nodes are changed only if the network is 
saturated and a new node cannot join, 




2. Preemptive: at each network event, the addresses of some nodes are changed 
so that a new node can immediately join without requiring a further recon- 
figuration of the network. 


The Preemptive strategy seems attractive because the network is constantly kept 
saturation-free, however it also requires a migration each time a new node joins, 
while the Last-Minute rule will force one migration only when necessary. 


We will now describe the Preemptive Balancing rule (PB-rule) and later the
Last-Minute rule (LM-rule). The protocols for ensuring their distributed implementation
are presented in Chapter 4.


The PB-rule works as follows:


1. At first fix $l = L-1$, and iteratively apply the following procedure, lowering
$l$ by one each time it ends, until $l = 1$.


2. For any gnode $h$ of level $l$ ($h \in [G]_l$), let $\Gamma(h) = \Gamma_l(h)$.

3. If all the neighbors $y \in \Gamma(h)$ are such that $|h| \le |y|$, then stop. Otherwise,


4. if there is any neighbor $y$ such that $|h| \ge |y| + 2$, then consider $y'$ s.t.
$y' = \arg\min |y|$, where $y$ ranges in $\{ y \in \Gamma(h) \mid |h| \ge |y| + 2 \}$,
let exactly one node migrate from $h$ to $y'$ and stop. Otherwise,


5. if there is a neighbor $y \in \Gamma(h)$ s.t. $|h| = |y| + 1$, then let $g$ in $[G]_l$ be any of
the nearest gnodes to $h$ such that $|h| = |g| + 2$. If such a $g$ does not exist,
stop. Otherwise, let $h = p_1, p_2, \dots, p_m = g$ be a shortest path connecting $h$
to $g$. Notice that by definition of $g$, we have


$|p_1| - 1 = |p_2| = \dots = |p_{m-1}| = |p_m| + 1$


Finally, for each $i = 1, 2, \dots, m-1$, let exactly one node from $p_i$ migrate to
$p_{i+1}$. After the migrations, the new configuration will be such that


$|p_1| = |p_2| = \dots = |p_m|$
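
Steps 3 to 5 can be illustrated with a small centralized Python sketch for a single gnode $h$ at one level. It is only a sketch under assumptions: `adj` (links between gnodes) and `size` (number of members) are hypothetical data structures, the choice of which border node migrates is not modelled, and each migration is reduced to a size update.

```python
from collections import deque


def pb_rule_step(adj, size, h):
    """Apply steps 3-5 to gnode h; update `size` in place and return the path used."""
    nbrs = adj[h]
    if all(size[h] <= size[y] for y in nbrs):  # step 3: nothing to do
        return None
    small = [y for y in nbrs if size[h] >= size[y] + 2]
    if small:  # step 4: migrate towards the smallest such neighbor
        path = [h, min(small, key=lambda y: size[y])]
    else:  # step 5: nearest g with |h| = |g| + 2, found by BFS
        parent, queue, target = {h: None}, deque([h]), None
        while queue and target is None:
            p = queue.popleft()
            for q in adj[p]:
                if q not in parent:
                    parent[q] = p
                    if size[h] == size[q] + 2:
                        target = q
                        break
                    queue.append(q)
        if target is None:
            return None
        path = []
        while target is not None:
            path.append(target)
            target = parent[target]
        path.reverse()
    for a, b in zip(path, path[1:]):  # exactly one node per hop, in path order
        size[a] -= 1
        size[b] += 1
    return path
```

On a chain `a - b - c - d` with sizes 3, 2, 2, 1, applying the step to `a` selects the path `a, b, c, d` and leaves every gnode with size 2.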


Remark 3.3.1. When the PB-rule has selected a migration path $p_1, \dots, p_m$, the
migrations have to happen in the order $p_1 \to p_2$, $p_2 \to p_3$, .... Otherwise,
suppose $p_{i+1} \to p_{i+2}$ happens before $p_i \to p_{i+1}$; then the link between $p_i$ and $p_{i+1}$ could
become broken if the unique border node connecting $p_{i+1}$ with $p_i$ has migrated to
$p_{i+2}$.

Remark 3.3.2. Steps 3, 4 and 5 can be replaced by a single step: substitute in
step 5 the condition $|h| = |y| + 1$, $|h| = |g| + 2$ with $|h| \ge |y| + 1$, $|h| \ge |g| + 2$. In
this way the migration path $p_1, \dots, p_m$ can be of length $m = 1$, $m = 2$ or $m \ge 3$.
From a global point of view, we can restate the PB-rule as follows: for any level $l$,
find a shortest migration path with a smallest gnode at its end.

Remark 3.3.3. The non-deterministic steps of the PB-rule (4. and 5.) can be made 
deterministic by applying different heuristics: 


1. Suppose that the group $g$ has more than one border node that can migrate.
In this case, the border node whose removal does not induce a split of the
gnode $g$ is preferred.


2. Suppose that $x$ can join more than one gnode $g_1, \dots, g_m$. When it decides
to join $g_i$, it becomes one of its border nodes and some routes may use
$x$ as a gateway to reach nodes in $g_i$. Thus, in order to reduce the latency
stretch, the node $x$ will prefer to join/migrate to the gnode $g_i$ such that
$\max_{y \in g_i} d(x, y)$ is minimized.

The main reason for using the Preemptive Balancing rule is that it constantly keeps the
network saturation-free by satisfying the following property:


Proposition 3.3.4. When the PB-rule terminates at level $l$,
$\forall g, h \in [G]_l \quad 0 \le \mathrm{abs}(|g| - |h|) \le 1$


Or in other words, the gnodes in $[G]_l$ have almost the same size.
Proof: Consider the set of all the shortest migration paths in $[G]_l$:
$\mathrm{NonIncrPaths}([G]_l) = \{ (p_1, \dots, p_m) \mid p_i \text{ is linked to } p_{i+1} \text{ in } [G]_l,\ |p_i| \ge |p_{i+1}|\ \forall i \}$
$\mathrm{MigrPaths}([G]_l) = \{ p \in \mathrm{NonIncrPaths}([G]_l) \mid |p_1| \ge |p_{\mathrm{length}(p)}| + 2,\ \mathrm{length}(p) = \min \mathrm{length}(\mathrm{NonIncrPaths}([G]_l)) \}$


If $\mathrm{MigrPaths}([G]_l)$ is empty, then
$\forall g, h \in [G]_l \quad 0 \le \mathrm{abs}(|g| - |h|) \le 1$   (1)
is true and the PB-rule terminates.
Suppose that $\mathrm{MigrPaths}([G]_l)$ is not empty. Define the imbalance of $[G]_l$ as:


$I = \sum_{p \in \mathrm{MigrPaths}([G]_l)} I(p)$


where $I(p) = \max \{ |p_1| - |p_{\mathrm{length}(p)}| - 1,\ 0 \}$
Notice that
$G \text{ finite} \Rightarrow |\mathrm{MigrPaths}([G]_l)| < \infty \Rightarrow I < \infty$
$I(p) > 0 \Leftrightarrow |p_1| \ge |p_{\mathrm{length}(p)}| + 2$
$I = 0 \Leftrightarrow \mathrm{MigrPaths}([G]_l) = \emptyset$

The PB-rule selects one path $p \in \mathrm{MigrPaths}([G]_l)$ and, thanks to Prop. 3.2.11, it
makes exactly one node migrate from $p_i$ to $p_{i+1}$, $\forall i = 1, \dots, \mathrm{length}(p) - 1$. Thus,
$I(p)$ and $I$ decrease. Since $I$ is finite, eventually it becomes 0.

Corollary 3.3.5. If in the network there is a non-full gnode in $[G]_l$, with $1 \le l \le L-1$,
then any new single node will be able to join.


Proof: Suppose that the new node = is linked to a node y € h of level I’ < 1. We 
have two cases. In the first, h € [G]; is not full. Then, x can directly join to h by 
taking an address of the form *.a)_1.h;...hp_1.hy. The connectivity constraint is 
satisfied because x is linked to y and thus also to h; Wi >I’. 

In the second case, h; is full Vi > 1, but by hypothesis, there is a non full 
gnode g € [GI;. In this case, let « assume a temporary address of the form 
*.0)-1-h,...hp_1, with 2, > S (note®). This will change the size of h; to 
|hi| = [Ay] + 1 = S41. Since the PB-rule applies at all levels, when it termi- 
nates we have that 


0 < abs(|f| —|gl) <1 VF EIGhs_lfl<S vFelGh (2) 

lgl<s 
After the migrations $x$ belongs to some $f$ and by (2) it can now assume a
proper address, with $0 \le x_{l-1} \le S-1$.


We now proceed to describe the Last Minute rule (LM-rule). We will later see
that it is more efficient than the PB-rule.

Consider the procedure of the PB-rule; the LM-rule’s procedure is obtained
by substituting the size $|\cdot|$ operation with the following:

IXI| = i es 
(note⁶) In this way, we have
JAl < S, |g =S, |b} =S+h = [Bll > Ilgll > lll 
and a gnode will apply the balancing rule only when it becomes full. 


The Corollary 3.3.5 still holds with the due changes, thus also the LM-rule avoids 
saturation. 
Remark 3.3.6. The LM-rule is a generalization of the PB-rule. In fact, if $S$ in the
definition of $\|\cdot\|$ is replaced by a parameter $S_0$, then with $S_0 = 0$ we obtain the
PB-rule.


⁵We are temporarily violating the constraint of using an address in $\{0, 1, \dots, S-1\}$.

⁶We are improperly stating that the size of a gnode can become larger than $S$. What we really
mean is that if a gnode $X$ is full and $h$ nodes want to join $X$, then $|X| = S + h$.


We end this chapter with the description of the rules that allow a new node to 
join the network. 

Definition 3.3.7. Suppose that a node $x$ is turned on. If $x$ is now connecting
multiple disconnected networks, then $x$ will prefer to join the largest one. $x$ has
two strategies for joining a network:


1. (Dispersive-rule) if there is a free gnode of level $L-1$, then $x$ will create it.
Otherwise, let $\lambda_g$ be the length of the migration path that is created by the
balancing rules when $x$ joins $g$. Then, $x$ joins the neighbor $g \in \Gamma_l(x)$
that minimizes $\lambda_g$, where $l$ is the maximum level s.t. $[G]_l$ contains a non-full
gnode.


2. (Aggregative-rule) even if there is a free gnode that $x$ can create, $x$ prefers
to join a neighboring gnode, as described above.


Intuitively, the Aggregative rule is more costly because it moves the configuration
toward saturation. We will later see that simulations confirm this intuition.


Chapter 4


Distributed Balancing Rules 


The balancing rules can be naturally translated to a distributed version: the only 
information required for constructing a migration path is the size of the groups, 
which can be known from the routing tables of their nodes. Moreover, the migra- 
tion paths starting from a given group can be locally discovered using a BFS-like 
exploration. The BFS search does not flood all the single nodes because a group 
can be visited by selecting only one of its nodes. The discovery of migration paths
is started by border nodes: when $b \in \mathrm{bnode}(g,h)$ receives a routing update and
notices that $\|g\|$ has become too large (or too small) with respect to $\|h\|$, it will
try to create a migration path and migrate.


In the section below, we will see that the actions of multiple border nodes can be 
coordinated with atomic distributed locks. Next, we will describe how the nodes 
can repair a gnode split and how two separated networks can be merged once they 
become connected. 


4.1 The Memory of a Group 


There are various situations where events need to be serialized: 


1. when two or more nodes want to simultaneously join to the same group they 
cannot act independently, otherwise there is the risk that they will choose 
the same address; 


2. the migration of a group node of level $l \ge 1$ is not instantaneous: its single
nodes change address one by one. Suppose that $g$ decided to migrate because
a condition $Q$ was true. While the migration process runs, the condition $Q$
might become false and the remaining nodes in $g$ will not have any reason
for continuing the migration. For example, a condition $Q$ can be $Q$ = “the
group $g$ has become split”.


Also, if g is full, it cannot accept new nodes and this is true also when it is 
migrating. However, since during the migration its size gradually decreases, 
a new node might believe that g is not full and may join to it. 


A solution to the above problems is to take one step further in viewing group nodes
of level $l$ as nodes belonging to groups of level $l + 1$. We define the memory of a
gnode as the atomic distributed memory formed by its single nodes.


There are different ways of forming an atomic distributed memory. A simple one
is the following: let $\mu_0(g) = \min g$ be the single node in $g$ with the lowest address.
Then the memory of $g$ is identified with the memory of $\mu_0(g)$ and atomicity is
achieved using simple locking mechanisms. Nodes in $g$ are able to contact $\mu_0(g)$
with the same greedy routing adopted by the HDHT, i.e. they will contact in
order $\mu_{L-1}(g), \mu_{L-2}(g), \dots, \mu_0(g)$. However, this solution relies on a single node:
when the address $\min g$ changes, the memory of $g$ remains in an inconsistent state
until all its nodes discover the new $\min g$ propagated by the routing updates. We
suggest that, at the price of increased communication, it should be possible to
realize a fault tolerant atomic distributed memory by applying the Paxos consensus
protocol [24] and the ideas presented in Etna [25]. We now briefly describe what
its main components would be: $\mu_0(g)$ becomes the primary memory of $g$ and it
serves and serializes the read/write requests. $k$ nodes of $g$ are elected as replicas;
each of them fetches and maintains a copy of the primary memory. In order to
ensure a uniform spatial distribution, the replicas are selected across the hierarchy
with the following function:


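$\mu(g, k)$:
    If $\mathrm{lvl}(g) = 0$, return $\{g\}$
    else $g$ is a gnode with elements $g = \{h_1, \dots, h_{|g|}\}$, with $h_i < h_{i+1}\ \forall i$
        If $k \le |g|$, return $\{\mu_0(h_1), \dots, \mu_0(h_k)\}$, where $\mu_0(h_i) = \min h_i$
        else let $r = k \bmod |g|$, $d = \lfloor k/|g| \rfloor$
            return $\mu(h_1, d+1) \cup \dots \cup \mu(h_r, d+1) \cup \mu(h_{r+1}, d) \cup \dots \cup \mu(h_{|g|}, d)$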
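
A possible rendering of this selection function in Python is sketched below. The representation is an assumption made for the example: a single node is a tuple of digits with the top-level digit first, a gnode is a list of its children, and children are ordered by their lowest member. Note that a subtree holding fewer nodes than the number of replicas requested from it simply returns all it has.

```python
def mu0(g):
    """Single node with the lowest address inside g (a single node is its own minimum)."""
    return g if not isinstance(g, list) else min(mu0(h) for h in g)


def replicas(g, k):
    """Spread up to k replicas across the children of g, following the function above."""
    if not isinstance(g, list):  # lvl(g) = 0
        return {g}
    children = sorted(g, key=mu0)
    n = len(children)
    if k <= n:
        return {mu0(h) for h in children[:k]}
    r, d = k % n, k // n
    out = set()
    for idx, h in enumerate(children):
        # the first r children receive one extra replica
        out |= replicas(h, d + 1 if idx < r else d)
    return out
```

With two level-1 gnodes of two nodes each, asking for three replicas selects both nodes of the first gnode and the lowest node of the second.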


Each time a read/write request is issued to the memory of g, the primary node 
verifies the memory consistency by querying the replicas with the Paxos protocol. 
If it receives more than (k+1)/2 ACKs from the replicas it will accept the request. 
Using a counter mechanism, in the case of concurrent writes only the most up to 
date write commits are accepted by the replicas. Every time the primary node
changes, i.e. $\mu_0(g)$ points to another node $x$, the Paxos protocol populates the
primary memory by collecting from the majority of replicas the most up to date
version.


Finally, we describe how we can solve our serialization problems: 


1. a node $x$ that wants to join $g_l = *.g_l \dots g_{L-1}$ by taking an address
$*.x_{l-1}.g_l \dots g_{L-1}$ will try to set to 1 the $x_{l-1}$-th bit of the memory of $g_l$. Since
its memory is atomic, no other node will be able to simultaneously set to 1
the $x_{l-1}$-th bit, and thus join $g_l$ with the same address.


2. similarly as above, a bit is set to 1 until the condition Q is true. In this way, 
the nodes can atomically check if Q is true or not. 
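
The bit-claiming idea of point 1 can be sketched locally in Python. This is only a stand-in under a strong assumption: the atomic distributed memory of the gnode is replaced by a single in-process object protected by a lock, and the class and method names are hypothetical.

```python
import threading


class GnodeMemory:
    """Local stand-in for the atomic memory of a gnode: one bit per child slot."""

    def __init__(self, S):
        self._bits = [False] * S
        self._lock = threading.Lock()

    def try_claim(self, slot):
        """Atomically set bit `slot`; return True only for the caller that set it."""
        with self._lock:
            if self._bits[slot]:
                return False
            self._bits[slot] = True
            return True

    def release(self, slot):
        """Clear bit `slot`, e.g. when the node holding that address leaves."""
        with self._lock:
            self._bits[slot] = False
```

If several joining nodes call `try_claim` on the same slot concurrently, exactly one of them obtains `True` and may take the corresponding address; the others have to pick a different slot.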


4.2 Gnode Split 


Suppose a gnode $h$ splits into connected components $H_1, \dots, H_m$. Then in order
to satisfy the connectivity constraint of group nodes, all but one of $H_1, \dots, H_m$ will
have to change address.

A node $x \in H_i$ can recognize the splitting of $h$ after the routing updates occur:
there will be a missing path to reach a node of another component. Using the
routing table, the node $x$ can deduce the number of nodes of its component. Also,
$x$ can know $\min H_i$, the smallest address of the nodes in $H_i$. If $x$ is a border node of
$h$, it acts as follows: if $\min H_i \ne \min h$, $x$ leaves $H_i$ and migrates. After completing
the migration, it also tells its old neighbors in $H_i$ to do the same. In this way, the
only component that does not migrate is the one that contains $\min h$.


Notice that if $H_i$ does not contain any border node, then $H_i$ has been completely
disconnected from the network. In this case, nodes in $H_i$ do not have to change
addresses.


We can optimize the above rule by minimizing the number of migrations, as follows:
for each component $H_1, \dots, H_m$, the node $\min H_i$ sends to a rendezvous node not
in $h$ the size $|H_i|$ along with its address $\min H_i$. The rendezvous node acts as a
hub and forwards each message to all the nodes $\min H_i$, $i = 1, \dots, m$. The node
$\min H_i$ will in turn forward the received message to the border nodes of $H_i$. In
this way, the border nodes know the size of all components and the new condition
for migration becomes $\min H_i \ne \min H_j$, where $H_j$ is the component such that


$|H_j| = \max_{1 \le i \le m} |H_i|$


$\min H_j \ge \min H_k \quad \forall k \text{ s.t. } |H_k| = |H_j|$

In other words, $H_j$ is the component with the maximum size and with the maximum
$\min H_j$ among the components of maximum size.

It follows that the only component that will not migrate is one of those with the
largest size.
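
The selection of the component that keeps its addresses can be written compactly. The Python sketch below assumes that each component is given as a set of comparable addresses already known to the caller (in the protocol this knowledge comes from the rendezvous node); the function names are hypothetical.

```python
def surviving_component(components):
    """Component that does not migrate: largest size, ties broken by the largest minimum address."""
    return max(components, key=lambda H: (len(H), min(H)))


def must_migrate(node, components):
    """A node migrates unless it belongs to the surviving component."""
    return node not in surviving_component(components)
```

For the components `{5, 6, 7}`, `{1, 2, 3}` and `{9}`, the first one survives: it has maximum size and, among the two largest components, the larger minimum address.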


4.3 Network ID and Network Merging 


Up to now we have considered the whole network $G$ as a connected network. However,
for a complete distributed implementation we have to consider the general
case of a disconnected network. That is, $G$ will be formed by the union of connected
networks $G_1, \dots, G_m$. Let us consider the case where $m = 2$. Since $G_1, G_2$ are
disconnected, the nodes in each network have no way of coordinating the choice
of their addresses, thus a node in $G_1$ can choose the same address as a node in $G_2$.
If later on $G_1, G_2$ become connected, then the resulting network $G_1 \cup G_2$ will have
an address collision.

There are two ways for solving this problem: 


1. Use an out of band communication between $G_1$ and $G_2$ for coordinating the
address assignment.


With this solution we are effectively creating virtual links between $G_1$ and
$G_2$ and we can always assume that the original network is connected.


2. Assign a unique ID to each distinct network.


This solution can be viewed as extending the address $y_0 \dots y_{L-1}$ of each node
to $y_0 \dots y_{L-1}.\eta$, where $\eta$ is the network ID (netid).


All the nodes with the same netid will form a connected network. In other
words, the netid can be effectively viewed as a group node of level $L$.


We now analyze in more detail how to assign a network ID to each node.


1. Each node $v$ chooses a random number $\eta_0(v)$, with sufficiently many bits
such that the probability that two nodes have the same $\eta_0(v)$ is negligible. Notice
that the netid will not be used for routing purposes and thus the number of
bits can be chosen more freely.


2. The network ID of a connected network $G_1$ is then


$\eta(G_1) = \min_{v \in G_1} \eta_0(v)$


The nodes of $G_1$ can know $\eta(G_1)$ in a single flooding round: at first, each
node $v$ sets $\eta(G_1) := \eta_0(v)$; secondly, it broadcasts its currently known $\eta(G_1)$
to its neighbors. After a node $w$ receives a broadcast $\eta$, it sets $\eta(G_1) :=
\min \{\eta(G_1), \eta\}$. If $\eta(G_1)$ has changed, $w$ retransmits it to its other neighbors.


3. With the above procedure, if $G_1$ is a connected network, then all its nodes
will agree on a unique $\eta(G_1)$. Suppose now that the two networks $G_1$ and $G_2$
become connected. Suppose also that the nodes know an estimate of the
size of their original network. Then instead of choosing $\min \{\eta(G_1), \eta(G_2)\}$,
the nodes will prefer the netid of the larger network. In this way, the
number of flooded nodes will be minimized. In particular, this is a necessary
optimization when $G_1$ is formed by a single node $x$, i.e. when $x$ joins the
network $G_2$.


4. Suppose that the node $x \in G$ with the minimum netid $\eta_0(x)$, i.e. $\eta_0(x) = \eta(G)$,
leaves the network. A node will participate in a new round of netid-discovery
only after it receives the routing update regarding the departure
of $\eta_0(x)$.


This rule handles the case when the network $G$ becomes disconnected into
components $G_1, \dots, G_m$, due to a link or node failure. The nodes in the
components where $x$ is missing will change their netid, so that at the end of
the netid-discovery each network $G_i$ will have a distinct netid.


The above solution presents a drawback: when the node $x$ with
$\eta_0(x) = \eta(G)$ dies, a new netid-discovery round occurs and the entire
network is flooded. By assuming that nodes can synchronize their clocks, we can
dampen this problem with the following heuristic: the node with the highest
uptime is the least likely to leave the network. To apply the heuristic, it
suffices to change the definition of $\eta_0(x)$ as follows:


\[ \eta_0(x) = (\text{time when } x \text{ was turned on},\ \text{random number}) \]


Two pairs $(t, r)$, $(t', r')$ are compared using the lexicographic order.
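
The flooding round and the uptime heuristic can be illustrated with a minimal
Python sketch. It is not the protocol implementation: the function name
`discover_netid`, the dictionary-based graph and the sample values are
assumptions made for illustration, and message passing is simulated with a
queue instead of real broadcasts.

```python
from collections import deque

def discover_netid(adjacency, eta0):
    """Flood the minimum eta0 value through a connected network.

    adjacency: dict node -> list of neighbors (undirected graph).
    eta0: dict node -> (boot_time, random_number); hypothetical values.
    Returns a dict node -> netid known by that node after the flooding.
    """
    netid = dict(eta0)            # each node starts from its own eta0
    queue = deque(adjacency)      # every node broadcasts once at the start
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            # Python tuples compare lexicographically: boot time first,
            # then the random number as a tie-breaker.
            if netid[v] < netid[w]:
                netid[w] = netid[v]
                queue.append(w)   # w retransmits only if its value changed
    return netid

adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
eta0 = {"a": (1700, 42), "b": (1200, 77), "c": (1200, 13), "d": (1900, 5)}
netids = discover_netid(adjacency, eta0)
```

In this example nodes b and c were turned on at the same time, so the random
component decides, and every node ends up with the pair chosen by c.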


Finally, suppose that two separate networks $G_1$, $G_2$ become linked and that
$G_1$ is the one that will change netid. The two networks also have to resolve
all the address collisions. This is possible because the nodes bridging the two
networks will exchange their routing tables. If they notice that two gnodes of
level $L-1$ have the same address, then the one in $G_1$ will be alerted and its
nodes will start to migrate.
Chapter 5


The Cost of Balance 


The main cost associated with the balancing rules is the number of migrations
that occur while the network evolves.

Each time a single node migrates, it has to advertise its new address by updating
the name → address mapping stored in the HDHT. Moreover, if the node is storing
some mappings of the HDHT, it has to transfer them to the most appropriate
node. Assuming that the HDHT distributes the mappings equally and that at
most a constant number of them are registered by each node, a location update
requires the transfer of $O(\log N)$ stored mappings¹. Thus, if $M$ is the total
number of migrations, $O(M \log N)$ mappings will be transferred.

Also, the change of the routing address of a node affects higher layer transmission 
protocols: a TCP connection between a node and a migrating node will break. 
Thus additional countermeasures such as virtual circuits are required. 

Finally, when a node $x$ migrates from a group $g$ to a group $h$, new routing
updates are necessary for discovering the paths connecting $x$ to the other nodes
of $h$. Depending on the routing protocol implementation, it may be necessary to
update the routing tables of the higher levels, for example when $x$ affects the
route stretch introduced by the groups $g$ and $h$.


For all the above reasons, our ultimate aim is to understand the behavior of the
balancing rules when the network evolves and to estimate the number of
migrations.

¹ The $\log N$ term derives from the extended HDHT, where the same mapping is
stored in $L = \log_S N$ nodes.


Remark 5.0.1. It is possible to trade the handoff overhead caused by migrations
of high level gnodes for an increased communication cost of the HDHT. The
read/write procedure of the HDHT becomes the following: a node
$x = x_0 \dots x_{L-1}$ stores its name → address mapping only at the nodes with
address $h_i(\text{name}) = h_0 \dots h_i x_{i+1} \dots x_{L-1}$, $\forall i \le l$,
where $l \le L-1$. The mapping stored at node $h_i(\text{name})$ points to the
partial address of the form $x_0 \dots x_l$. In this way, $x$ does not need to
update the mapping when one of its higher gnodes $*.x_i \dots x_{L-1}$, with
$i > l$, migrates. Thus, the $S^{l+h}$ mapping updates required by the migration
of a full gnode of level $l+h$ are saved. However, in order to retrieve the
address of $x$, a node $y$ that does not belong to $*.x_l \dots x_{L-1}$ will have
to query all the possible nodes with an address of the form
$h_0 \dots h_{l-1}.*$


We now give some results on the performance of the balancing rules. Then we
proceed to an experimental evaluation through simulation.


5.1 Number of Migrations 


Remark 5.1.1. In the case of node removal the PB-rule induces new migrations 
when a gnode becomes too small in comparison with other gnodes, i.e. when the 
condition of Proposition 3.3.4 does not hold anymore. Instead, the LM-rule does 
not force any migration when the size of a gnode decreases. 


We now examine what happens when new nodes are added to the network.

Proposition 5.1.2. Suppose that all the gnodes in $[G]_1$ have the same size $a$,
with $S-1 \ge a \ge 1$. Suppose $s$ nodes join a gnode $g \in [G]_1$, where
$s \le (S-a)N$, $N = |[G]_1|$.


The number of migrations caused by a balancing rule (PB or LM rule) is:

\[ M(s) = \left\lfloor \frac{s}{D} \right\rfloor \sum_{j=1}^{m} (j-1) D_j
        + \sum_{j=1}^{m'} (j-1) D_j + m' r' \]

where

\[ m' = \max\Big\{ m' \;\Big|\; (s \bmod D) - \sum_{k=1}^{m'} D_k \ge 0 \Big\},
   \qquad r' = (s \bmod D) - \sum_{k=1}^{m'} D_k \]
\[ D = \sum_{i=1}^{m} D_i = S_0 N, \qquad N = |\Gamma(g)| = |[G]_1| \]
\[ m = \max_{y \in [G]_1} d(g, y), \quad \text{where } d(x, y) \text{ is the hop distance} \]
\[ D_i = S_0 E_i, \qquad E_i = |\{ h \in [G]_1 \mid d(g, h) = i \}| \]
\[ S_0 = \begin{cases} 1 & \text{PB-rule} \\ S - a & \text{LM-rule} \end{cases} \]


For the LM-rule the formula becomes simpler:

\[ M_{LM}(s) = \sum_{j=1}^{m'} (j-1) D_j + m' r' \]
\[ m' = \max\Big\{ m' \;\Big|\; s - \sum_{k=1}^{m'} D_k \ge 0 \Big\},
   \qquad r' = s - \sum_{k=1}^{m'} D_k, \qquad s \le (S-a)N \]


Notice that the LM formula does not depend on $m$, while the PB formula does.
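
As an illustration, the formula of Proposition 5.1.2 can be evaluated with a
short Python sketch. The function names are hypothetical. The sketch assumes
that the distance $d$ counts the gnodes on a shortest path, so that
$d(g, g) = 1$; this matches the proof, where the path $P^1 = (g)$ has length 1.
The second function recomputes the same quantity as a direct sum of
$m_i - 1$ over the individual additions, which gives a cross-check of the
closed form.

```python
from collections import deque

def level_sizes(adjacency, g):
    """Return [E_1, ..., E_m]: E_i is the number of gnodes at distance i from g."""
    dist = {g: 1}                    # assumption: d(g, g) = 1, since P^1 = (g)
    queue = deque([g])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    m = max(dist.values())
    return [sum(1 for d in dist.values() if d == i) for i in range(1, m + 1)]

def migrations(E, s, S0=1):
    """Closed form of Prop. 5.1.2, with D_i = S0 * E_i (S0 = 1 for the PB-rule)."""
    D_levels = [S0 * e for e in E]
    full_rounds, rest = divmod(s, sum(D_levels))
    # enumerate() is 0-based, so the index plays the role of (j - 1)
    total = full_rounds * sum(j * d for j, d in enumerate(D_levels))
    m_prime = 0
    while m_prime < len(D_levels) and rest >= D_levels[m_prime]:
        rest -= D_levels[m_prime]
        total += m_prime * D_levels[m_prime]
        m_prime += 1
    return total + m_prime * rest    # m' * r'

def migrations_by_paths(E, s, S0=1):
    """Direct sum: the i-th addition uses a path of length m_i, costing m_i - 1."""
    lengths = [j for j, e in enumerate(E, start=1) for _ in range(S0 * e)]
    return sum(lengths[i % len(lengths)] - 1 for i in range(s))

# Linear path of N = 20 gnodes, g at one extreme (the setting of Prop. 5.1.5)
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 20] for i in range(20)}
E = level_sizes(path, 0)
pb = migrations(E, 25)           # PB-rule, S0 = 1
lm = migrations(E, 25, S0=9)     # LM-rule with S = 10, a = 1, so S0 = S - a = 9
```

For the same 25 additions the PB-rule value is much larger than the LM-rule
value, which is consistent with the comparison plotted in Figure 5.1.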


Proof: Let's fix $S_0 = 1$ and consider the PB-rule first.
By Rem. [3.3.2, pg. 24], each time one node is added in $g$, a shortest migration
path is selected, i.e. for each addition $1 \le i \le s$ we have a path

\[ P^i = (p^i_1, \dots, p^i_{m_i}), \qquad \mathrm{length}(P^i) = m_i, \qquad
   p^i_1 = g \quad \forall i \]

We prove by induction on $i$ that if $i \le D$, then
1. $i > 1 \Rightarrow m_{i-1} \le m_i \le m_{i-1} + 1$
2.
2.0 $p^i_1 = g$
2.1 $|p^i_j| = a+1 \quad \forall\, 2 \le j \le m_i - 1$
2.2 after the migrations of $P^i$, we have $|g| = a+1$
3. Before the $i$-th node is added, we have
3.0 $|p^i_{m_i}| = a$
after, we have:
3.1 $|p^i_{m_i}| = a+1$
3.2 $f_i : \{P^k \mid k \le i\} \to \{h \in [G]_1 \mid |h| = a+1\}$,
    $P^k \mapsto p^k_{m_k}$, is a bijection
3.3 $i = |\{P^k \mid k \le i\}| = |\{h \in [G]_1 \mid |h| = a+1\}|$
4. $M^i = m_i - 1$ is the number of migrations caused by the addition of the
   $i$-th node


If $i = 1$, the new size of $g$ is $a+1$. The size of all the other gnodes is
still $a$. Thus, $P^1 = (g)$ and $m_1 = 1$, $M^1 = 0$. Also, 3. is true.
Consider the case $i+1$. Let $P^{i+1}$ be any shortest migration path. By
induction, before adding the new $(i+1)$-th node, the size of a gnode is either
$a$ or $a+1$ and $|g| = a+1$.
By definition of migration path, we have:

\[ p^{i+1}_1 = g, \quad |g| = a+2 \]
\[ |p^{i+1}_j| = a+1 \quad \forall\, j \le m_{i+1} - 1 \]
\[ |p^{i+1}_{m_{i+1}}| = a \qquad (0) \]

Such a $P^{i+1}$ exists: if, by contradiction, all the gnodes have size $a+1$,
then each of them has already been reached by a migration path, i.e. by 3. we
have $i = |[G]_1|$, but

\[ i + 1 \le D \Rightarrow i \le |[G]_1| - 1 \qquad \text{(by hypothesis)} \]

which contradicts $i = |[G]_1|$.
By 2., 3.0, 3.1 and (0), the gnode $p^{i+1}_{m_{i+1}}$ cannot be any of the
gnodes reached by previous migration paths, so

\[ |\{P^k \mid k \le i+1\}| = |\{P^k \mid k \le i\}| + 1 = i + 1 \]
After adding the new node, the migrations occur and we have:

\[ p^{i+1}_1 = g, \quad |g| = a+1 \]
\[ |p^{i+1}_j| = a+1 \quad \forall\, j \le m_{i+1} - 1 \]
\[ |p^{i+1}_{m_{i+1}}| = a+1 \]

At least $m_{i+1} - 1$ migrations have occurred to let one node migrate from
$p^{i+1}_j$ to $p^{i+1}_{j+1}$, $\forall j \le m_{i+1} - 1$. Finally, by
Prop. 3.2.11, these are the only migrations that have occurred, thus no other
gnode apart from $p^{i+1}_{m_{i+1}}$ has increased its size. Thus, 2., 3., 4.
are true. We now prove that also 1. is true.

\[ |p^{i+1}_{m_{i+1}-1}| = a+1 \;\overset{3.2}{\Rightarrow}\;
   \exists!\, j \le i : \; p^j_{m_j} = p^{i+1}_{m_{i+1}-1} \]
\[ P^{i+1} \text{ is a shortest migr. path} \Rightarrow m_j \ge m_{i+1} - 1 \]
\[ m_j \le m_i \;\; (j \le i, \text{ inductive Hp}) \Rightarrow
   m_{i+1} - 1 \le m_i \Rightarrow m_{i+1} \le m_i + 1 \]

It is impossible that $m_{i+1} < m_i$, otherwise $P^i$ is not a shortest migr.
path $\Rightarrow m_i \le m_{i+1}$.


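If $D \ge i \ge \sum_{j=1}^{q} D_j$, the number of migration paths of length
$j \le q$ is equal to $D_j$ (1):

let $f$ be the restriction of $f_i$ to $\{P^k \mid k \le i,\ m_k = j\}$;

$f : \{P^k \mid k \le i,\ m_k = j,\ p^k_1 = g\} \to D_j$ is a bijection
(here $D_j$ also denotes the set of gnodes at distance $j$ from $g$):

\[ |g| = a+1, \quad \forall h \in [G]_1 \;\; a \le |h| \le a+1 \]
\[ P^k \text{ is a shortest migration path} \Rightarrow P^k \text{ is a shortest path}
   \Rightarrow f(P^k) = p^k_{m_k} \in D_{m_k} = D_j \]
\[ f_i \text{ is injective} \Rightarrow f \text{ is injective} \]

From $i \ge \sum_{j=1}^{q} D_j$ and 1., 3.3, it follows that $f$ is surjective.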
By 4., the number of migrations after the $s$-th node has been added is:

\[ M(s) = \sum_{i=1}^{s} (m_i - 1) \qquad (5) \]

If $s \le D$, with (1) we can collect the terms in (5) by length:

\[ M(s) = \sum_{j=1}^{m'} (j-1) D_j + ((m'+1) - 1)\, r' \]
\[ m' = \max\Big\{ m' \;\Big|\; s - \sum_{k=1}^{m'} D_k \ge 0 \Big\},
   \qquad r' = s - \sum_{k=1}^{m'} D_k \]

$\sum_{j=1}^{m'} (j-1) D_j$ are the migrations required to increase the size of
all the gnodes reachable in at most $m'$ hops. Each time we add a node in one of
these gnodes, we are using a path of length $j$ and thus the required migrations
are $j - 1$. $r'$ is the number of gnodes reachable in $m' + 1$ hops that we can
still enlarge.

Finally, notice that if $s = D = |[G]_1|$,

\[ s = D = \sum_{k=1}^{m} D_k \]
\[ m' = m = \max \mathrm{length}(\mathrm{Path}([G]_1)), \qquad r' = 0 \]
\[ M(s) = \sum_{j=1}^{m} (j-1) D_j \]

and, after the migrations, all the gnodes have size $a+1$. Thus, the same initial
situation has been reached again. In the case of any new addition, we can reapply
the same reasoning. So, if $s = qD$ is a multiple of $D$:

\[ M(s) = q \sum_{j=1}^{m} (j-1) D_j \]

In general, considering the remainder $r$ of $s/D$, we have:

\[ M(s) = \left\lfloor \frac{s}{D} \right\rfloor \sum_{j=1}^{m} (j-1) D_j
        + \sum_{j=1}^{m'} (j-1) D_j + ((m'+1) - 1)\, r' \]

\[ \text{where } m' = \max\Big\{ m' \;\Big|\; r - \sum_{k=1}^{m'} D_k \ge 0 \Big\},
   \qquad r' = r - \sum_{k=1}^{m'} D_k \]


The formula for the LM-rule can be derived with a similar reasoning. The main
difference is that we can think of adding $S - a$ nodes each time. Another way
to derive it is to notice that the LM-rule is equivalent to an application of the
PB-rule on the graph obtained from $[G]_1$ by substituting a gnode $h \ne g$ with
$S - a$ identical copies (i.e. nodes with the same edges as $h$). In the new
graph, $E_i^{new}$ is equal to $(S - a) E_i$ $\forall i = 1, \dots, m$.

Finally, the simplification of the LM formula is derived as follows: we cannot
add more than $(S - a)|[G]_1| = (S - a)N = D$ nodes, thus we have $s \le D$.
For $s < D$ we have $\lfloor s/D \rfloor = 0$ and $r = s$, so that

\[ M(s) = \sum_{j=1}^{m'} (j-1) D_j + m' r' \]
\[ m' = \max\Big\{ m' \;\Big|\; s - \sum_{k=1}^{m'} D_k \ge 0 \Big\},
   \qquad r' = s - \sum_{k=1}^{m'} D_k \]

For $s = D$ both formulas give $\sum_{j=1}^{m} (j-1) D_j$.
Corollary 5.1.3. Consider two different address assignments such that $[G]_1$
and $[\bar{G}]_1$ are two different graphs. Suppose also that all the gnodes,
both in $[G]_1$ and in $[\bar{G}]_1$, have the same number of nodes $a$, with
$S-1 \ge a \ge 1$. As in Prop. 5.1.2, add $s$ nodes in a gnode $g \in [G]_1$ and
in a gnode $\bar{g} \in [\bar{G}]_1$. Let $M(s)$ be the number of migrations
induced in $[G]_1$ and $\bar{M}(s)$ those in $[\bar{G}]_1$.
If in $[\bar{G}]_1$ there are more shortest paths than in $[G]_1$, we have
$\bar{M}(s) \le M(s)$. More precisely, using the same notations of Prop. 5.1.2,
suppose

\[ \bar{E}_i = E_i + h_i, \qquad h_i \in \mathbb{Z}, \qquad
   \sum_{i=1}^{j} h_i \ge 0 \quad \forall j \le \max\{m, \bar{m}\} \]
\[ \bar{N} = N \]

then

\[ \bar{M}(s) \le M(s) \]

In other words, if $[\bar{G}]_1$ is more connected than $[G]_1$, it will induce
fewer migrations.

The intuitive explanation is that the migration paths in $[\bar{G}]_1$ are
shorter and cause fewer migrations than the equivalent paths in $[G]_1$.


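Proof: Recall from the previous proof that if $s \le N$,

\[ M(s) = \sum_{i=1}^{s} (m_i - 1) \]
\[ \mathrm{length}(P^i) = m_i \le m_{i+1} \le m_i + 1 \quad \forall i = 1, \dots, s-1 \]

Since $s \le N = \bar{N}$, the same is true for $\bar{M}$ and $\bar{m}_i$.

By induction on $i$, we prove that $m_i \ge \bar{m}_i$ $\forall i = 1, \dots, N$.
This suffices to show that $\bar{M}(s) \le M(s)$, also when $s > N$.
Base case:

\[ P^1 = (g), \quad \bar{P}^1 = (\bar{g}) \Rightarrow m_1 \ge \bar{m}_1 \]

Suppose $m_i \ge \bar{m}_i$ is true. Since
$\bar{m}_i \le \bar{m}_{i+1} \le \bar{m}_i + 1$ we have two cases.
CASE: $\bar{m}_{i+1} = \bar{m}_i$

\[ \bar{m}_{i+1} = \bar{m}_i \le m_i \le m_{i+1} \]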
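CASE: $\bar{m}_{i+1} = \bar{m}_i + 1$
Since $1 = m_1 \le m_2 \le \dots \le m_N$ and $m_{i+1} \le m_i + 1$, we have

\[ i = \sum_{j=1}^{m_i - 1} \nu_j + \rho(m_i, i) \]

where $\nu_j = |\{k \le N \mid m_k = j\}|$

and analogously,

\[ i = \sum_{j=1}^{\bar{m}_i - 1} \bar{\nu}_j + \bar{\rho}(\bar{m}_i, i) \]

where $\bar{\nu}_j = |\{k \le N \mid \bar{m}_k = j\}|$
Notice that

\[ \rho(q, i) = |\{k \le i \mid m_k = q\}| \]
\[ \bar{\rho}(q, i) = |\{k \le i \mid \bar{m}_k = q\}| \]
\[ \nu_q \ge \rho(q, i) \]

Recall that the number of all migration paths of length $j$ is equal to the
number of gnodes of $[G]_1$ reachable in $j$ hops:

\[ \nu_j = E_j \]

Analogously for $\bar{G}$:

\[ \bar{\nu}_j = \bar{E}_j = E_j + h_j = \nu_j + h_j \]

We are under the hypothesis $\bar{m}_{i+1} = \bar{m}_i + 1$, thus

\[ i + 1 = \sum_{j=1}^{\bar{m}_{i+1} - 1} \bar{\nu}_j + \bar{\rho}(\bar{m}_{i+1}, i+1)
   > \sum_{j=1}^{\bar{m}_i} \bar{\nu}_j
   = \sum_{j=1}^{\bar{m}_i} \nu_j + \underbrace{\sum_{j=1}^{\bar{m}_i} h_j}_{\ge 0}
   \ge \sum_{j=1}^{\bar{m}_i} \nu_j \qquad (*) \]

$m_{i+1} \ge m_i \ge \bar{m}_i$, and it is impossible that
$m_{i+1} = \bar{m}_i$, in fact:

\[ i + 1 \overset{(*)}{>} \sum_{j=1}^{\bar{m}_i} \nu_j
   = \sum_{j=1}^{m_{i+1}} \nu_j
   \ge \sum_{j=1}^{m_{i+1} - 1} \nu_j + \rho(m_{i+1}, i+1) = i + 1 \]

$i + 1 > i + 1$: contradiction,

thus $m_{i+1} \ge \bar{m}_i + 1 = \bar{m}_{i+1}$.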


Corollary 5.1.4. When adding $s$ nodes in $[G]_1$, the maximum number of
migrations is reached when $[G]_1$ is a linear path and all the nodes are added
at one of the two extremes. This is true both for the LM-rule and for the
PB-rule.


Proof: This follows from Corollary 5.1.3.


Proposition 5.1.5. Under the hypotheses of Prop. 5.1.2, suppose that $[G]_1$ is a
linear path and $g$ is one of the two extremes. Then, the number of migrations
caused by the PB-rule is:

\[ M(s) = \left\lfloor \frac{s}{N} \right\rfloor \frac{N(N-1)}{2} + \frac{r(r-1)}{2},
   \qquad r = s \bmod N \]

and by the LM-rule is:

\[ M_{LM}(s) = S_0 \frac{m'(m'-1)}{2} + m' r' \]
\[ m' = \left\lfloor \frac{s}{S_0} \right\rfloor, \qquad r' = s \bmod S_0,
   \qquad S_0 = S - a \]

[Plot: $M(s)$ for the PB-rule and the LM-rule]

Figure 5.1: Number of migrations of the LM-rule and the PB-rule when $[G]_1$ is a
linear path (Proposition 5.1.5). The parameters are $N = 20$, $S = 10$, $a = 1$.


Proof: This follows directly from Prop. 5.1.2, because $[G]_1$ is a linear path
of $N$ nodes and we have

\[ m = N, \qquad D_i = S_0 \cdot 1 \;\; \forall i, \qquad D = S_0 N \]

In the case of the PB-rule we have $S_0 = 1$ and

\[ m' = \max\{ m' \mid r - m' \ge 0 \} = r, \qquad r' = r - r = 0 \]

Instead, for the LM-rule,

\[ m' = \max\{ m' \mid s - m' S_0 \ge 0 \} = \left\lfloor \frac{s}{S_0} \right\rfloor,
   \qquad r' = s - m' S_0 = (s \bmod S_0), \qquad S_0 = S - a \]


5.2 Simulation 


The scenario of the simulation is based on a city wireless mesh network where each 
node is located in a house and the density allows the coverage of the whole city, 
i.e. the resulting network graph is connected. 


Random Geometric Graphs (RGG) are the simplest model of wireless mesh networks.
In a RGG the nodes are uniformly distributed in the unit square $[0, 1]^2$ and
a link is established between two nodes $x$, $y$ if $d(x, y) \le r_0$.

However, the geometric topology of a city wireless network is very different from 
the unit square. If a node places its antenna behind a window, then at least half 
of its coverage is shadowed by the building. Also, the height of buildings may vary 
and antennas placed on the roofs may be shadowed by taller buildings. 

In order to simplify the matter, while still taking into account the obstacles of
a city wireless network, we will assume that the links of a RGG are removed with
a fixed probability $1 - p_c$. We will call the resulting graph a City Random
Geometric Graph (CRGG).

If $p_c = 1$, a CRGG is still a RGG. If instead $p_c < 1$, we can approximate it
with a classic random graph where an edge is established between two nodes with
a fixed probability $p$. For this reason, in our simulations we will resort to
random graphs only.

Proposition 5.2.1. Let $G$ be a CRGG with radius $r_0$ and link probability
$p_c$, then


1. If $p_c = 1$, $G$ is a RGG


2. If $p_c < 1$ and we ignore nodes near the border of the unit square, $G$ is a
random graph with link probability $p_c \pi r_0^2$

Proof:

\[ P(xy \text{ are linked}) = P(xy \text{ are linked} \mid d(x, y) \le r_0)\,
   P(d(x, y) \le r_0) = p_c\, \mathrm{Area}(\Gamma_{r_0}(x)) = p_c \pi r_0^2 \]

where $\Gamma_{r_0}(x)$ is the disc of radius $r_0$ centered at $x$.


Note: a further generalization of CRGGs are the Random Distance Graphs studied
in [28].
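
The CRGG model can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function names are hypothetical,
the standard library is used instead of a graph package, and border effects
are not corrected, so the empirical edge density is slightly below
$p_c \pi r_0^2$.

```python
import math
import random

def city_random_geometric_graph(n, r0, p_c, seed=None):
    """Place n nodes uniformly in [0, 1]^2 and link two nodes at distance
    <= r0 with probability p_c (i.e. drop the RGG link with prob. 1 - p_c)."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            close = math.dist(points[i], points[j]) <= r0
            if close and rng.random() < p_c:
                edges.add((i, j))
    return points, edges

def expected_link_probability(r0, p_c):
    """Link probability of Prop. 5.2.1 for nodes far from the border."""
    return p_c * math.pi * r0 ** 2

points, edges = city_random_geometric_graph(200, 0.1, 0.5, seed=7)
density = len(edges) / math.comb(200, 2)   # empirical edge ratio of the sample
```

With $p_c = 1$ the function returns a plain RGG, and with $p_c = 0$ it returns
a graph without edges.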


Notice that the assumption of having a constant probability of link establishment
is justified by the fact that wireless links between fixed nodes can break only
due to new obstacles, and in a city we can consider this an infrequent event.
As a consequence, once the network has been formed, node additions and removals
will be the only network events. In other words, at a given time a node can be
either On or Off:

\[ \lambda_x(t) = \begin{cases} 1 & \text{node } x \text{ is On at time } t \\
   0 & \text{else} \end{cases} \]

Under the above assumptions, we performed two different experiments. 


5.2.1 First Experiment 


To compare various properties of the balancing rules we considered a simple
network evolution that induces all kinds of migrations, namely migrations caused
by splits, by network merging and by new joining nodes.


In detail, the first experiment runs as follows:


1. A connected random graph $G$ is generated with link probability $p$ and
$N_{max}$ nodes. We ensure that $G$ is connected by repeatedly generating
$p$-random graphs with the NetworkX Python library [30] until a connected graph
is found. A connected random graph will be found quickly if $p$ is greater than
the critical threshold $1/N$ of random graphs. In fact, with high probability
the size of the giant component of a random graph increases exponentially
as $p > 1/N$ increases and it becomes the full graph when $p > \ln(N)/N$ [27],
[29].


2. Only the first level of the hierarchy is considered and the maximum number 
of level 1 gnodes is set to ceil(Nmax/S), where S is the gnode size. 


3. At the start of the simulations, all the nodes are turned Off. 


4. Sequentially, each node is turned On, the balancing rules are executed and 
the number of migrations is counted. 


5. When the number of On-nodes is $N_{max}$, i.e. all the nodes have joined the
network, we start the turn-Off phase: at each step we turn Off a node and count
the number of migrations.
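
Step 1 of the procedure above (rejection sampling of a connected random graph)
can be sketched as follows. The dissertation uses NetworkX for this step; the
sketch below is a standard-library stand-in so that it is self-contained, and
its function names and the `max_attempts` limit are assumptions. The balancing
rules themselves are defined in Chapter 3 and are not reproduced here.

```python
import math
import random
from collections import deque

def gnp_random_graph(n, p, rng):
    """Classic random graph: each of the n(n-1)/2 edges exists with prob. p."""
    adjacency = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adjacency[u].add(v)
                adjacency[v].add(u)
    return adjacency

def is_connected(adjacency):
    """Breadth-first search from an arbitrary node reaches every node."""
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adjacency)

def connected_random_graph(n, p, seed=None, max_attempts=10_000):
    """Generate p-random graphs until a connected one is found."""
    rng = random.Random(seed)
    for _ in range(max_attempts):
        graph = gnp_random_graph(n, p, rng)
        if is_connected(graph):
            return graph
    raise RuntimeError("no connected graph found; p is probably too small")

# p above ln(N)/N, where a random graph is connected with high probability
graph = connected_random_graph(50, 2 * math.log(50) / 50, seed=0)
```

For $p$ close to $1/N$ the loop may need many attempts, which is why the sketch
bounds the number of retries instead of looping forever.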


In figure 5.2, we can see how a simulation of the first experiment evolves.
The network starts with 0 nodes and at each step a new node is added. The Y
axis reports the total number of migrations. When the network becomes full at
step 1000, the simulation proceeds by removing a node at each step.

[Plot: total number of migrations (Y axis) versus simulation steps, from 0 to
2000 (X axis)]

Figure 5.2: Evolution of the first experiment, with parameters Nmax = 1000,
Nmin = 1, p = C/N, S = 20.


Migrations and Link Probability 


In another simulation we varied the probability of link formation
$p = d/N_{max}$, where $d$ is the average degree of the graph. As shown in
figure 5.3, as $p$ increases, the number of migrations decreases. This is not
surprising: as $p$ increases, nodes are more connected. A joining node is then
linked to a greater number of groups and it has a greater chance of finding a
non-full group as a neighbor. Thus, fewer migrations occur, because the LM-rule
enforces a migration only when a node has no other choice than to join a full
group.

Secondly, since it becomes more difficult to disconnect a group with a node re- 
moval, the number of migrations due to a gnode split decreases too. 


For these reasons, in all successive simulations we preferred to study networks with 
low connectivity, by choosing 2/N < p < In(N)/N. 
Figure 5.3: The number of migrations as a function of the probability of link
formation of the generated random graphs. The parameters for this simulation
are Nmax = 100, S = 6, Nmin = 1.


PB and LM Rules Comparison 


To compare the PB-rule against the LM-rule, we repeatedly ran the first
experiment with parameters in the ranges $80 \le N_{max} \le 100$,
$3/N_{max} \le p \le 4/N_{max}$, $6 \le S \le 20$.


For each generated graph, we used both the PB-rule and the LM-rule. In all
simulations, the number of migrations caused by the LM-rule ($M_{LM}$) was
always strictly smaller than the number caused by the PB-rule ($M_{PB}$) and,
on average, $M_{LM}$ was $45 \pm 11\%$ less than $M_{PB}$.


These results show that the LM-rule is more efficient than the PB-rule. Note: in
all subsequent simulations we used the LM-rule only.


Aggregative and Dispersive Rules Comparison 


As before, we compared the Aggregative-rule (A-rule) with the Dispersive-rule
(D-rule) (see section 3.3.7). As we expected, the A-rule is more costly. On
average, the D-rule generates $37 \pm 24\%$ fewer migrations than the A-rule,
and only in 8% of all simulations did the D-rule cause more migrations than the
A-rule.
Group Size


The maximum size $S$ of a group node is a tradeoff parameter between the size
of the routing table and the latency stretch, but it also influences the number
of migrations $M$.

From the left plot of figure 5.4, we can observe that when the size of a group
becomes larger than half of the number of nodes (100), the migrations begin to
decrease. In fact, it becomes increasingly difficult to disconnect a group due
to a node removal (we will explain why in the next section). Also, the migration
paths decrease in length.

Figure 5.4: On the left, the plot represents the number of migrations as a
function of the group size. On the right, the same data is reported, but the
migrations due to network merging are also counted. The parameters for this
simulation are: Nmax = 200, Nmin = 1, p = 4/Nmax.


However, if we also take into account the migrations due to network merging 
(Chapter 4.3), M reaches a minimum when the group size is 2 and then increases 
until a point where it becomes almost constant (right figure 5.4). 


Small values of S have a drawback: the number of established virtual addresses 
becomes higher when S is small. This is shown in figure 5.5. 


Groups are Random Graphs 


We repeated the simulation 5000 times, varying the group size $S$, and counted
how many edges $E'$ were contained in a full group node at the end of the
network formation. In figure 5.6, we can see the edge ratio
$p' = E' / \binom{S}{2}$ plotted as a function of $S$. As $S$ becomes larger,
$p'$ tends to the probability $p$ of link formation of the network graph. For
$S > 20$, the relative error on all points is < 1.5%, while on 70% of points it
is < 0.25%.

[Plot: number of virtual addresses (Y axis) versus size of a group (X axis)]

Figure 5.5: The number of established virtual addresses depends on the size of
the group nodes. Smaller sizes force the creation of more virtual addresses.


Since the error is small, $p'$ depends mainly on the size $S$ and not on the
underlying process that generates the groups. For this reason, we can consider
large groups of size $S$ as random graphs $G_{S, p'(S)}$.


We also notice that for all points $p' > p$, $p'S < pN$ (except 5 points) and
$p'S$ has an increasing behavior as $S$ grows. This can explain why larger
groups are harder to split: when $h$ nodes are removed with a fixed probability
from a random graph $G_{S,p'}$, the remaining nodes form a $G_{S-h,p'}$ graph
and the expected size of its giant connected component depends monotonically on
its average degree $p'(S - h)$.


Virtual Addresses and Average Path Length 


By considering all the data acquired from simulations, we counted how many
migrations force the creation of virtual addresses: of 157-10° simulated
migrations only 0.87% established new virtual addresses. We also observed that
the average migration path length is short: on average it is $2.00 \pm 0.03$
hops long. This is not surprising: the average path length in random graphs is
$\approx \ln N$, thus only with large scale simulations should it be possible
to notice higher migration path lengths.

Figure 5.6: Each point indicates the edge ratio of a full group with size S. The
parameters of the network graph are Nmax = 200, p = 3/Nmax. The bottom
constant line is y = p.


5.2.2 Second Experiment 


In the second experiment, we studied the behavior of the balancing rules as the
network becomes more and more chaotic. $N_{min}$ nodes are considered stable,
while the other $N_{max} - N_{min}$ nodes are marked as churn nodes. During the
simulation the churn nodes are turned on and off with a fixed probability and
the migrations are counted. In detail,


1. A connected random graph G is generated as in the first experiment. 


2. Nmin nodes are turned on and the balancing rules are applied until the net- 
work reaches a stable configuration. The migrations of this step are not 
counted. 


3. The following procedure is repeated MaxSteps times: each churn node is
turned off with probability $1 - p_c$ or on with probability $p_c$. The LM-rule
is applied and the number of migrations is counted.
Figure 5.7: At each simulation step, each of the Nmax − Nmin churn nodes is
turned on and off with probability Pr Churn. The gray intensity indicates the
average number of migrations per step (Migrations / MaxSteps) obtained at the
end of the simulation. The parameters used for this simulation are: Nmax = 200,
S = 20, p = 4/Nmax.

$N_{min}$ and $p_c$ directly affect the number of migrations. As shown in
figure 5.7, the migrations increase when $p_c$ gets near 1/2 and $N_{min}$
decreases. In fact, $N_{min}$ determines the percentage of churn nodes in the
network. Also, the number of alive churn nodes at each step follows a binomial
distribution and the expected number of churn nodes that will change their
status from one step to the next is:

\[ E(\Delta_t) = E\Big( \sum_{x=1}^{N_{max} - N_{min}} I_x \Big)
   = 2 (N_{max} - N_{min})\, p_c (1 - p_c) \]

where $I_x = |\lambda_x^{t+1} - \lambda_x^{t}|$, with $\lambda_x^{t} = 1$ if
node $x$ is On at time $t$, and

\[ P(I_x = 1) = P(\lambda_x^{t+1} = 1, \lambda_x^{t} = 0)
   + P(\lambda_x^{t+1} = 0, \lambda_x^{t} = 1) = 2 p_c (1 - p_c) \]

Thus at $p_c = 1/2$ and $N_{min} = 0$, there is the maximum expected number of
On/Off switches.
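
The expected number of status changes can be checked numerically with a small
sketch. The function names are hypothetical, and the sketch assumes, as in
step 3 of the experiment, that the state of a churn node at each step is drawn
independently, On with probability $p_c$.

```python
def switch_probability(p_c):
    """P(I_x = 1): the node is Off then On, or On then Off, in two
    independent consecutive draws."""
    return (1 - p_c) * p_c + p_c * (1 - p_c)

def expected_switches(n_max, n_min, p_c):
    """E(Delta_t) = 2 (Nmax - Nmin) p_c (1 - p_c), by linearity of expectation."""
    return (n_max - n_min) * switch_probability(p_c)

# Parameters of the later experiments: Nmax = 300, p_c = 1/2
all_churn = expected_switches(300, 0, 0.5)
mostly_stable = expected_switches(300, 240, 0.5)
```

With $p_c = 1/2$ the expression reduces to $(N_{max} - N_{min})/2$, so it can
be varied linearly by changing $N_{min}$.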


M as a function of Nmax

We repeated the experiment by varying the number of nodes $N_{max}$ and by
setting $N_{min} = N_{max}/k$, for a fixed $k > 0$. Notice that in this way the
ratio of churn nodes is equal to $(N_{max} - N_{min})/N_{max} = 1 - 1/k$. As
shown in figure 5.8, $M$ increases almost linearly as $N_{max}$ grows. This is
also evident from their correlation coefficient (see table below), which is
almost 1.

[Plot: Average Migrations per Churn]

Figure 5.8: Each point represents the average number of migrations M per step of
a network with Nmax nodes. Each line is obtained by fixing a different Nmax/Nmin
ratio. The parameters used for the simulation are: p = (ln(Nmax) − 1)/Nmax,
S = 20, MaxSteps = 100.


Nmax/Nmin               | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10
Correlation Coefficient | 0.90 | 0.94 | 0.97 | 0.97 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99
Linear Best Fit Error   | 3.01 | 4.68 | 4.68 | 4.45 | 4.00 | 3.23 | 3.16 | 3.06 | 2.93


M as a function of $E(\Delta_t)$


In the next simulation, we calculated the average number of migrations $M$ as a
function of the expected number of network changes $E(\Delta_t)$. By choosing
$p_c = 1/2$, $E(\Delta_t)$ becomes $(N_{max} - N_{min})/2$ and we can vary it
linearly by changing $N_{min}$.


In figure 5.9, we can observe that $M$ increases sublinearly as the number of
stable nodes decreases. In other words, the addition of a new churn node has
little influence on the network.


When $N_{min} \ge 240$ ($\le 20\%$ of churn nodes), the standard deviation of
each data point is < 2. In this case, $M$ depends only lightly on the underlying
graph and on the particular nodes that switch their On/Off status. Instead, when
the percentage of churn nodes grows, the variance of $M$ grows too. In other
words, with high churn the peculiar features of the underlying graph begin to
matter more.

Figure 5.9: Average number of migrations as a function of the expected number
of network changes $E(\Delta_t)$. The parameters used for the simulation are:
Nmax = 300, S = 20, p_c = 1/2, MaxSteps = 100.


Gnode Split Migrations 


There are two types of migrations: those caused by new nodes joining the network 
and those caused by gnode splits induced by old nodes that leave the network. 
Intuitively, the latter type is more costly: the number of migrations caused by a 
new node is equal to the length of a migration path, which on average is —as we 
have seen— rather short. Instead, a gnode split can force the migration of a large 
part of a group. 

In the next simulation, we counted the number of migrations occurring due to a
gnode split ($M_S$). In figure 5.10, we can see that when $N_{min} < 250$
(> 17% of churn nodes), more than 60% of migrations are due to gnode splits
only. When $N_{min} > 200$, even though the average percentage $M_S/M$
decreases, the variance increases notably.

[Plot: Percentage of Migrations by Split]

Figure 5.10: The Y axis represents the percentage of migrations due to gnode
splits: $(M_S / M) \cdot 100$, where M is the number of migrations per step. The
parameters used for the simulation are: Nmax = 300, S = 20, p_c = 1/2,
MaxSteps = 100.


5.3 Bounds on the Number of Migrations


In the previous simulations, we have considered a network subdivided in
ceil(N/S) groups of single nodes, ignoring the borders formed by levels higher
than 1. This gives an under-estimate of the actual number of migrations:

Proposition 5.3.1. Let $M$ be the total number of migrations occurring in a
network after a sequence of events, using either the PB-rule or the LM-rule.
Under the same events, consider an application of the chosen balancing rule to
$[G]_1$ only, or in other words let only single nodes migrate. Let $M'$ be the
corresponding number of migrations.

We have:

\[ M' \le M \]

Proof: We omit the proof for space constraints. The main idea is to notice that,
without the higher levels, nodes have more freedom for creating group nodes and
there are no higher gnodes that can become split.


Proposition 5.3.2. Consider a hierarchical network $G$ with $L$ levels, groups
of size $S$ and $N_{max} = S^L$. Let $M$ be the number of migrations caused by
some network events that occurred at the same instant.
$M$ is upper bounded as follows:

\[ M \le 2 \delta L N_{max} \]

where $\delta$ is the diameter of the network after the events.
If $G$ is a random graph $G_{N_{max},p}$ and $p > 1/N_{max}$, on average we have

\[ M \le 2\, \frac{\ln^2(N_{max})}{\ln(p N_{max}) \ln(S)}\, N_{max} \]

Additionally, if $p N_{max} \ge e$ and $S \ge e$, we have:

\[ M = O(N_{max} \ln^2(N_{max})) \]

Proof: The balancing rules are applied at all levels, thus

\[ M = \sum_{i=1}^{L-1} M_i, \qquad M_i = M_{J,i} + M_{S,i} \]

where $M_i$ is the number of single nodes that migrate for balancing $[G]_i$,
$M_{J,i}$ are the migrations caused by new nodes joining the network, while
$M_{S,i}$ are those caused by gnode splits.
Recall that for each migration of a node a migration path $P$ is established and
the number of migrations is $\mathrm{length}(P) - 1$.
If $P$ is a path in $[G]_i$, with $i \ge 1$, its length is upper bounded by the
diameter of $[G]_i$. Also, $[G]_i$ can be viewed as a contraction of the
original graph $G$, so

\[ \mathrm{length}(P) \le \delta([G]_i) \le \delta(G) \qquad (1) \]

If at level $i$ there is still a non-full gnode $g$ ($\mathrm{lvl}(g) = i$,
$|g| < S$), a new joining single node will cause a migration path in $[G]_i$.
When $[G]_i$ becomes full, i.e. $|[G]_i| = S^{L-i}$, any further addition of
nodes will cause migration paths in lower levels only. Thus, at each level $i$
at most $S^{L-i}$ migration paths can be established. Each migrating group has
at most $S^i$ single nodes, thus by (1) we have

\[ M_{J,i} \le \delta S^{L-i} S^{i} = \delta N_{max} \quad \forall\, 1 \le i \le L-1 \]
\[ M_J = \sum_{i=1}^{L-1} M_{J,i} \le (L-1)\, \delta N_{max} \]

When a gnode $g \in [G]_i$ is split, all the single nodes not belonging to the
largest component will migrate. The maximum number of single nodes of $g$ is
$S^i$. In the worst case, $[G]_i$ is full and all its groups are split, thus

\[ M_{S,i} \le S^{i}\, |[G]_i|\, \delta \le S^{i} S^{L-i} \delta = \delta N_{max}
   \quad \forall\, 1 \le i \le L-1 \]
\[ M_S = \sum_{i=1}^{L-1} M_{S,i} \le (L-1)\, \delta N_{max} \]

In sum we have:

\[ M = M_J + M_S \le 2 \delta L N_{max} \]

Finally, the expected diameter of a random graph $G_{N_{max},p}$ is [29]:

\[ \delta_{N_{max},p} = \frac{\ln(N_{max})}{\ln(p N_{max})} \]

and since $L = \ln(N_{max}) / \ln(S)$, we have

\[ M \le 2\, \frac{\ln^2(N_{max})}{\ln(p N_{max}) \ln(S)}\, N_{max} \]
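
To get a feeling for the size of the bound in the random graph case, it can be
evaluated numerically. The sketch below is only illustrative: the function name
and the sample parameters are assumptions, and the expected-diameter expression
is the one quoted above from [29], which requires $p N_{max} > 1$.

```python
import math

def migration_upper_bound(n_max, p, group_size):
    """Evaluate 2 * delta * L * Nmax for a random graph G(Nmax, p)."""
    if p * n_max <= 1 or group_size <= 1:
        raise ValueError("the bound needs p * Nmax > 1 and S > 1")
    diameter = math.log(n_max) / math.log(p * n_max)   # expected diameter
    levels = math.log(n_max) / math.log(group_size)    # L = ln(Nmax) / ln(S)
    return 2 * diameter * levels * n_max

# Hypothetical example: 10^4 nodes, average degree 100, groups of size 10
bound = migration_upper_bound(10_000, 0.01, 10)
```

In this example the expected diameter is 2 and the hierarchy has 4 levels, so
the bound is 16 times the number of nodes.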


Remark 5.3.3. The results of the second experiment (section 5.2.2) suggest that,
if $S$ is fixed, in a network with a constant churn the number of level 1
migrations is $\Omega(N_{max})$. Therefore, using Proposition 5.3.1 we may
conjecture that in general $M = \Omega(N_{max})$. Finally, by the previous
proposition we can conclude that for connected random graphs:

\[ \Omega(N_{max}) = M = O(N_{max} \ln^2(N_{max})) \]
Chapter 6


Conclusion and Future Research 


A hierarchical topology simplifies routing and name management: the routing
tables are small and distributed hash tables can be overlaid in a natural way.
However, when the network is dynamic, the task of constantly maintaining a
coherent hierarchy becomes complicated. Not only do nodes have to change
addresses when a group becomes disconnected, but additional measures are also
required when new nodes enter the network. In this dissertation, we have studied
which rules have to be enforced in order to incrementally update the hierarchy
and which protocols can realize their distributed implementation. We have
analyzed the behavior of the rules under different network conditions. As
expected, well connected and not very dynamic networks are easy to manage. In
the worst case, the number of address changes is upper-bounded by
$O(N \ln^2 N)$, where $N$ is the size of the network. However, with a constant
churn, the number of migrations increases at least linearly as $N$ grows.


There are two directions for continuing the research on scalable mesh networks. 
In the first one, we can try to optimize the hierarchical architecture by exploring 
various tradeoffs. For example, as explained in Remark 5.0.1, the handoff overhead 
caused by migrations can be reduced at the cost of increasing the communication 
cost of the HDHT. Another tradeoff arises when trying to minimize both the latency
stretch and the number of migrations: groups may be formed in such a way as to
minimize the introduced latency stretch. However, the formation of group nodes
then becomes dependent not only on the connectivity properties of the nodes, but also
on the weights of their links. Thus, when the link weights change, new migrations
may be required.


The second direction is to relax the connectivity constraint: instead of requiring a 
group node to be internally connected through physical paths, we allow the use of 
virtual circuits, or in other words, group nodes are created on a routing overlay
imposed on the physical network. Relaxing the connectivity constraint simplifies
many dynamics of the topology maintenance. However, if the overlay is not
carefully constructed, the latency stretch may increase rapidly.
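To make this risk concrete, the sketch below measures the latency stretch of a route that follows a sequence of overlay hops. It assumes the usual definition of stretch (cost of the route actually taken divided by the cost of the cheapest physical path), which is assumed here rather than quoted from the earlier chapters. The example graph, the node names and both helper functions are hypothetical.

```python
import heapq

# Hypothetical physical network: a line a - b - c - d with unit link weights.
EXAMPLE_GRAPH = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}


def shortest_costs(graph, source):
    """Dijkstra: cost of the cheapest physical path from source to each node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def latency_stretch(graph, overlay_route):
    """Stretch of a route that visits the given overlay hops in order.

    Each overlay hop is carried over a cheapest physical path. The graph is
    assumed to be connected and the two endpoints are assumed to differ.
    """
    if overlay_route[0] == overlay_route[-1]:
        raise ValueError("endpoints must differ")
    routed = 0.0
    for a, b in zip(overlay_route, overlay_route[1:]):
        routed += shortest_costs(graph, a)[b]
    direct = shortest_costs(graph, overlay_route[0])[overlay_route[-1]]
    return routed / direct


if __name__ == "__main__":
    # A detour through a distant overlay node versus a route along the line.
    print(latency_stretch(EXAMPLE_GRAPH, ["a", "d", "b"]))
    print(latency_stretch(EXAMPLE_GRAPH, ["a", "c", "d"]))
```

In this toy network, relaying traffic from a to b through the distant node d costs five link traversals instead of one, whereas an overlay route that follows the physical line adds no stretch.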

The approach of creating unconnected group nodes that minimize the latency
stretch is closely related to a current research field called Name-independent
Compact Routing [31][32]. The Compact Routing problem is to construct a routing
scheme that minimizes both the latency stretch and the routing table size.

One scheme that is similar to our relaxed hierarchical architecture is the “Generalized
routing scheme for $O(N^{1/2})$ space”, presented in [34].

We note, however, that current compact routing algorithms do not achieve the
original objective of constructing self-configuring scalable mesh networks: they
assume that the network is static. The design of an efficient dynamic scheme
that is updated incrementally as the network evolves is currently an open problem,
even though some lower bounds on the communication cost involved are already
known [36], [37].